Report on US ARMY AWARD DAAD19-00-1-0534
Optimal  Subband  Coders  for  Cyclostationary  Signals 
PI:  Soura  Dasgupta 
University  of  Iowa 

The  purpose  of  the  award  was  to  study  the  subband  coding  of  cyclostationary  signals. 

Accomplishments to date are listed below.

• We had proposed to study subband coding of cyclostationary signals with uniform filter banks under two bit rate constraints: (i) when the bit allocation among the subband signals is static, and (ii) when the bit allocation is periodic but the bit budget across the subband signals is constant at each instant of time. Problem (i) was solved before the award was received. We have solved (ii). In addition, we have proposed a third criterion, in which the bit budget is defined as a fixed average over a period of the signal cycle. We have characterized the optimizing filter bank and shown that, although it is the same for all three criteria, the new criterion leads to a higher coding gain.

Figure 1 shows the coding gains under the last two schemes, for an input with 2-periodic statistics. The crosses are the second scheme and the circles the third. Observe that the last scheme leads to higher coding gains.


Figure 1: Coding gain plots (coding gain vs. number of channels; subband coding of a WSCS signal with period 2).
• In the case of stationary signals, realization of these optimizing filter banks requires the notion of compaction filters. We have introduced the concept of energy compaction for cyclostationary signals as well. We have also generalized the notion of M-band filters to the cyclostationary case. We have studied the theory and design of cyclostationary energy compaction filters and shown how they relate to the optimizing filter banks.

• We had proposed to study the use of nonuniform filter banks for subband coding of both stationary and cyclostationary signals. En route to such a study, we have shown that every biorthogonal dyadic filter bank has a tree structured decomposition. This permits the optimization of such filter banks to be conducted in a tree structured framework, making the underlying task much simpler.

• A common occurrence of cyclostationarity is in Orthogonal Frequency Division Multiplexed (OFDM) communications. We have shown that certain channel resource allocation problems for OFDM systems are dual problems of subband coding. We have solved the optimum resource allocation problem for OFDM in the multiuser environment. Specifically, we have considered in turn a variety of settings, culminating in one in which each user is assigned a different number of subchannels and different bit rates, is required to achieve differing symbol error rates, and supports potentially different modulation schemes. Our goal is to select the input and output block transforms, the linear redundancy removal scheme at the receiver, the number of bits/symbol assigned to each subchannel, and the subchannel assignment to each user, in order to achieve the QoS specifications under a zero ISI condition with minimum transmitted power. We assume knowledge of the equalized channel and the second-order statistics of the noise at the receiver input. We have shown the following.

(A)  The  optimum  input/output  block  transforms  are  orthonormal. 

(B) The selection of the optimum input/output transforms and redundancy removal schemes depends only on the channel/interference conditions, and does not depend on such service requirements as the required bit rates and symbol error rates. Thus there is a conceptual separation between the selection of these variables and the remaining tasks of bit loading and subchannel selection. In practical terms this considerably simplifies the optimization problem.

Figure 2 compares the transmit power of the DFT based DMT, with and without optimum bit allocation, against that of an optimum unitary transceiver with optimum bit allocation. We assume the equalized channel to be $C(z) = 1 + 0.5z^{-1}$, and a noise source $v(n)$ whose power spectral density is shown in fig. 1. We assume the DMT system supports two user services. Both services employ QAM modulation schemes, and the target rates for the two users are 600 Kbps and 1 Mbps respectively. The $(i,j)$ on the x-axis of the plot indicates that users 1 and 2 were allocated $i$ and $j$ channels, respectively. The plot shows that there is a 10 dB saving in transmit power with our design over the DFT based DMT under optimum bit allocation, and a 14 dB improvement over the conventional DMT with no optimum bit allocation.

Figure 2: Comparison of transmit power levels. Panel title: Power spectral density of noise.


• For optimum coding over transmission channels it is important that the transmitter be aware of the channel it transmits over. While in recent years many algorithms have been proposed that assume such channel knowledge at the transmitter, no method exists for making the transmitter channel aware. We have proposed a new feedback scheme that permits the transmitter to directly estimate the channel.

• In both subband coding and DMT, bit loading is an important problem. Specifically, for an $N$-subchannel system, these problems reduce to the general problem:

Minimize $P(b_1, \ldots, b_N) = \sum_{k=1}^{N} \phi_k(b_k)$ (1)

subject to the constraint

$\sum_{k=1}^{N} b_k = B, \quad b_k \in \{0, 1, \ldots, B\},$ (2)
where $\phi_k$ is a convex function of the nonnegative integer $b_k$, and $B$ is a positive integer. In subband coding

$\phi_k(b_k) = \alpha_k 2^{-2b_k}$ (3)

where $\alpha_k$ is determined by the signal variance in the $k$-th subchannel, and $P(b_1, \ldots, b_N)$ is the average distortion variance. In multicarrier systems

$\phi_k(b_k) = \alpha_k 2^{b_k}$ (4)

where the $\alpha_k$ reflect target symbol error rates (SER) and the channel and interference conditions experienced in the $k$-th subchannel, and $P(b_1, \ldots, b_N)$ is the total transmitted power. Higher values of $\alpha_k$ reflect more adverse subchannel conditions and/or lower target SER; $b_k$ is the number of bits assigned to each symbol in the corresponding subchannel.

The complexity of most existing algorithms for such discrete bit loading grows with $B$. We have formulated a new bit loading scheme whose complexity is independent of $B$ and yet, like existing schemes, grows as $O(N \log N)$ in the number of subchannels.
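To make the role of $B$ concrete, the following is a minimal Python sketch of the classical greedy (marginal-return) approach to problem (1)-(2). It is not the scheme proposed in this report: it is the textbook baseline whose running time grows with $B$, since it assigns one bit per iteration. The function names (`greedy_bit_loading`, `subband_cost`, `multicarrier_cost`, `total_cost`) are hypothetical and chosen only for illustration.

```python
import heapq


def subband_cost(alpha, b):
    """phi_k(b_k) = alpha_k * 2^(-2 b_k), as in (3)."""
    return alpha * 2.0 ** (-2 * b)


def multicarrier_cost(alpha, b):
    """phi_k(b_k) = alpha_k * 2^(b_k), as in (4)."""
    return alpha * 2.0 ** b


def total_cost(alphas, bits, phi):
    """Objective P(b_1, ..., b_N) of (1)."""
    return sum(phi(a, b) for a, b in zip(alphas, bits))


def greedy_bit_loading(alphas, total_bits, phi):
    """Assign total_bits one bit at a time (illustrative baseline only).

    Each step gives the next bit to the subchannel whose cost changes
    most favourably. For convex phi this greedy rule is optimal, but
    the loop runs total_bits times, so the work grows with B.
    """
    bits = [0] * len(alphas)
    # Min-heap of (change in cost from adding one more bit, index).
    heap = [(phi(a, 1) - phi(a, 0), k) for k, a in enumerate(alphas)]
    heapq.heapify(heap)
    for _ in range(total_bits):
        _, k = heapq.heappop(heap)
        bits[k] += 1
        a = alphas[k]
        heapq.heappush(heap, (phi(a, bits[k] + 1) - phi(a, bits[k]), k))
    return bits
```

Each of the $B$ iterations costs $O(\log N)$ heap work, which is the dependence on $B$ that the proposed scheme is stated to avoid.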

A  comparison  of  the  performance  of  the  algorithms  of  [2]  and  [1]  and  the  proposed 
algorithm  with  respect  to  the  number  of  computations  required  is  shown  in  the  figures 
3  and  4,  for  the  cases  where  N  =  32  and  N  =  64,  respectively.  In  implementing  [1], 
which  is  a  suboptimal  algorithm,  the  maximum  number  of  bits,  B*  that  any  channel 
can  be  assigned  is  kept  at  B. 


Figure 3: Runtime comparisons of the three algorithms for N = 32 (horizontal axis: number of bits to be allocated, B).

The number of computations needed for each algorithm to converge to the optimal solution was calculated by assuming that an addition, subtraction, div, mod, multiplication or division of two numbers needs one computation, as does a logical comparison between two decimal numbers. The results show that the algorithm described in [1] is linear with respect to B, while the algorithm in [2] needs a large number of computations to converge as B grows. The number of computations needed for the proposed algorithm is independent of B, apart from minor variations whose source is discussed in the paper referenced below. The improvement in performance is very significant if B is large compared to N.

Figure 4: Runtime comparisons of the three algorithms for N = 64.

Collaboration  with  DOD  Labs:  We  are  currently  part  of  a  team  that  is  initiating 
collaboration  with  TACOM  on  research  on  digital  humans.  Many  of  the  resource  allocation 
ideas  implicit  in  this  work  are  crucial  to  that  project. 
Subband Coding of Cyclostationary Signals

Ashish  Pandharipande  and  Soura  Dasgupta 
October  24,  2001 


Abstract 

We consider optimal orthonormal filter banks for subband coding of wide sense cyclostationary signals, with $N$-periodic second order statistics. An $M$-channel uniform filter bank, with $N$-periodic analysis and synthesis filters, is used as the subband coder. Dynamic schemes involving $N$-periodic bit allocation are employed. An average variance condition is used to measure the output distortion. We show that for at least three potential bit allocation strategies, the optimum filter bank is a principal component filter bank.

Index  Terms:  Subband  coding,  Filter  bank,  Bit  allocation,  Dynamic  schemes,  Majorization  theory. 
1 Introduction


Wide sense cyclostationary (WSCS) signals arise in many applications, [1], [2]. We consider optimum orthonormal subband coding of zero mean WSCS signals with $N$-periodic second order statistics, i.e. signals that obey for all $k, l$: $\mathcal{E}[x(k)x^*(l)] = \mathcal{E}[x(k+N)x^*(l+N)]$, where $\mathcal{E}[\cdot]$ denotes the expectation operator. The subband coder itself is an $M$-channel maximally decimated uniform filter bank (UFB) (see fig. 1), with $N$-periodic linear analysis and synthesis filters, $H_i(k,z)$ and $F_i(k,z)$, respectively. Each subband signal $v_i(k)$ is quantized at the $k$th instant by a $b_i(k)$ bit quantizer, $Q_i$. Subject to bit rate and orthonormality constraints, we wish to allocate bits $b_i(k)$ and select $H_i(k,z)$ and $F_i(k,z)$ to minimize the average variance of $x(k) - \hat{x}(k)$.

Among many possible bit rate constraints one can adopt, three are of interest here. The first, called static bit allocation (SBA), involves constant $b_i(k)$ and has been studied in [6]. The second and third both assume $N$-periodic bit allocation:

$b_i(k+N) = b_i(k).$ (1.1)

In the second, the average bit rate over all the channels is constant at each time instant, i.e. given $b$ and all $k$,

$b = \left(\sum_{i=0}^{M-1} b_i(k)\right)/M.$ (1.2)

The third assumes a fixed average bit rate over periods of length $N$:

$b = \frac{1}{MN}\sum_{k=0}^{N-1}\sum_{i=0}^{M-1} b_i(k).$ (1.3)

Among  these,  (1.1)  requires  the  least  computation  and  (1.3)  is  the  most  general.  On  the  other  hand,  (1.2) 
is  preferred  over  (1.3)  in  applications,  such  as  control  over  networks,  where  the  bit  rate  constraint  must  be 
enforced  at  every  time  instant. 

Recent  studies,  [4,  5]  have  established  that  the  optimum  UFB  subband  coder  for  Wide  Sense  Stationary 
(WSS)  signals  is  a  principal  component  filter  bank  (PCFB),  [3].  In  [6]  we  have  likewise  shown  that  for  SBA 
also  the  optimum  UFB  is  a  PCFB.  The  principal  contribution  of  this  paper  is  to  show  that  even  under 
(1.2)  and  (1.3),  optimality  is  attained  through  PCFB’s,  despite  the  differing  bit  allocation  constraints. 
This  suggests  the  universality  of  PCFB  based  solutions  for  problems  such  as  these. 


Figure  1:  An  M-channel  filter  bank  as  subband  coder. 
2 Optimum Bit Allocation

For any zero mean signal $x(k)$, define $\sigma_x^2(k) = \mathcal{E}[x^2(k)]$. All subband signals $v_i(k)$ have $N$-periodic second order statistics. As in [4], [5], we assume that the quantizers are modeled by additive zero mean noise sources, independent of the $v_i(k)$, with variances of the form

$\sigma_{q_i}^2(k) = c\,2^{-2b_i(k)}\sigma_{v_i}^2(k),$ (2.4)

with $c$ a distribution dependent constant. Note that under (1.1), $\sigma_{q_i}^2(k)$ are $N$-periodic.

Observe that the overall filter bank is $MN$-periodic. Let $E(z)$ and $R(z)$ be the transfer functions of $MN$-fold blocked versions of the analysis and synthesis banks respectively. Define the WSS vectors,

$\mathbf{x}(k) = [x_0(Nk), \ldots, x_0(Nk-N+1), \ldots, x_{M-1}(Nk), \ldots, x_{M-1}(Nk-N+1)]^T,$

$\mathbf{v}(k) = [v_0(Nk), \ldots, v_0(Nk-N+1), \ldots, v_{M-1}(Nk), \ldots, v_{M-1}(Nk-N+1)]^T,$ (2.5)

with power spectral density (PSD) matrices $S_x(\omega)$ and $S_v(\omega)$ respectively. We assume $S_x(\omega)$ is known. We assume the perfect reconstruction and orthonormality conditions,

$E^\dagger(z)E(z) = I = R^\dagger(z)R(z)$, and $R(z) = E^\dagger(z).$ (2.6)


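We propose to minimize the average variance of $q(k) = x(k) - \hat{x}(k)$ and, under (1.2) and (2.6), obtain

$\frac{1}{MN}\sum_{k=0}^{MN-1}\sigma_q^2(k) = \frac{1}{MN}\sum_{k=0}^{N-1}\sum_{l=0}^{M-1}\sigma_{q_l}^2(k) = \frac{c}{MN}\sum_{k=0}^{N-1}\sum_{l=0}^{M-1} 2^{-2b_l(k)}\sigma_{v_l}^2(k)$ (2.7)

$\geq \frac{c\,2^{-2b}}{N}\sum_{k=0}^{N-1}\left(\prod_{l=0}^{M-1}\sigma_{v_l}^2(k)\right)^{1/M},$ (2.8)

with equality holding iff for each $i, l, k$

$2^{-2b_i(k)}\sigma_{v_i}^2(k) = 2^{-2b_l(k)}\sigma_{v_l}^2(k).$ (2.9)

Likewise under (1.3), (2.7) is lower bounded by

$c\,2^{-2b}\left(\prod_{k=0}^{N-1}\prod_{l=0}^{M-1}\sigma_{v_l}^2(k)\right)^{1/MN},$ (2.10)

with the bound met iff for each $i, l, k_1, k_2$

$2^{-2b_i(k_1)}\sigma_{v_i}^2(k_1) = 2^{-2b_l(k_2)}\sigma_{v_l}^2(k_2).$ (2.11)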


Observe,  the  optimum  bit  allocation  scheme  (2.11)  is  more  stringent  than  (2.9). 
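As an illustration of the two equalization conditions, the sketch below computes real-valued bit allocations that satisfy (2.9) and (2.11) for a small table of subband variances and evaluates the resulting average distortion. It assumes the quantizer noise model $\sigma_{q_l}^2(k) = c\,2^{-2b_l(k)}\sigma_{v_l}^2(k)$, ignores the integer and nonnegativity requirements on bits, and uses hypothetical function names; it is a numerical check of the bounds (2.8) and (2.10), not an allocation algorithm from the paper.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def bits_per_instant(var, b):
    """Real-valued bits equalizing noise across channels at each instant.

    var[k][l] is the variance of subband l at instant k. Each row of the
    result averages to b, as required by constraint (1.2).
    """
    out = []
    for row in var:
        gm = geometric_mean(row)
        out.append([b + 0.5 * math.log2(v / gm) for v in row])
    return out


def bits_per_period(var, b):
    """Real-valued bits equalizing noise across all channels and instants.

    The whole table averages to b, as required by constraint (1.3).
    """
    gm = geometric_mean([v for row in var for v in row])
    return [[b + 0.5 * math.log2(v / gm) for v in row] for row in var]


def average_distortion(var, bits, c=1.0):
    """Average quantization noise variance over one period."""
    n, m = len(var), len(var[0])
    total = sum(2.0 ** (-2 * bits[k][l]) * var[k][l]
                for k in range(n) for l in range(m))
    return c * total / (m * n)
```

For `var = [[4.0, 1.0], [16.0, 1.0]]` and `b = 2`, the per-instant allocation attains the bound (2.8) and the per-period allocation attains the smaller bound (2.10), consistent with the arithmetic-geometric mean inequality.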

Consequently  UFB  selection  reduces  to  the  following  problem: 

Problem 2.1 Consider the $MN \times MN$ system $E(z)$ with WSS input vector $\mathbf{x}(k)$ with given Hermitian PSD matrix $S_x(\omega)$. Suppose $\mathbf{v}(k)$ in (2.5) is the output of $E(z)$. For (1.2) (resp. (1.3)) find $E(z)$ such that $J_1$ (resp. $J_2$) is minimized subject to (2.6).

$J_1 = \sum_{k=0}^{N-1}\left(\prod_{l=0}^{M-1}\sigma_{v_l}^2(k)\right)^{1/M}$ and $J_2 = \prod_{k=0}^{N-1}\prod_{l=0}^{M-1}\sigma_{v_l}^2(k).$ (2.12)
While $J_2$ is similar to the corresponding cost function in the WSS case, [5], $J_1$ is more complicated. The difference stems from the fact that implicit in (1.2) are $N$ bit budgets. Both in turn are different from the cost function for (1.1) considered in [6]. Finally, while $J_2$ does not change by permuting the subband variances, $J_1$ does. Indeed given a set of subband variances at different time instants we need consider only the arrangements that lead to the minimum value of $J_1$. Such optimal arrangements are characterized below.

Optimum Arrangement: Among the various permutations of $\sigma_{v_i}^2(j)$, ones that minimize $J_1$ obey, [7]:

$\sigma_{v_m}^2(k_1) > \sigma_{v_n}^2(k_2) \implies \prod_{i=0,\, i\neq m}^{M-1}\sigma_{v_i}^2(k_1) \leq \prod_{i=0,\, i\neq n}^{M-1}\sigma_{v_i}^2(k_2).$ (2.13)

For a 2-channel filter bank, $M = 2$, this requires that the largest be paired with the smallest, the second largest with the second smallest, etc.
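The 2-channel pairing rule can be checked numerically. The sketch below, with hypothetical function names, assumes $M = 2$, so that $J_1$ reduces to the sum over time instants of the square root of the product of the two variances assigned to that instant. It compares the "largest with smallest" pairing against a brute-force search over every possible pairing.

```python
import math


def j1_two_channel(pairs):
    """J_1 for M = 2: sum over instants of sqrt(product of the pair)."""
    return sum(math.sqrt(a * b) for a, b in pairs)


def largest_with_smallest(variances):
    """Pair the largest variance with the smallest, and so on inward."""
    s = sorted(variances, reverse=True)
    return [(s[i], s[-1 - i]) for i in range(len(s) // 2)]


def all_pairings(values):
    """Yield every way to split an even-length list into unordered pairs."""
    if not values:
        yield []
        return
    first, rest = values[0], values[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in all_pairings(remaining):
            yield [(first, partner)] + tail
```

With the six variances `[9.0, 7.0, 5.0, 3.0, 2.0, 1.0]` (three time instants), the rule gives the pairs (9, 1), (7, 2), (5, 3), and no other pairing yields a smaller value of `j1_two_channel`.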


3  Optimum  Subband  Coder 

We  now  characterize  the  optimum  selection  of  E(z),  by  introducing  the  notions  of  majorization  and 
Schur  concavity,  [7]. 

Definition 3.1 Consider two sequences $x = \{x_i\}_{i=1}^{n}$ and $y = \{y_i\}_{i=1}^{n}$ with $x_i \geq x_{i+1}$ and $y_i \geq y_{i+1}$. Then we say that $y$ majorizes $x$, denoted as $x \prec y$, if the following holds with equality at $k = n$:

$\sum_{i=1}^{k} x_i \leq \sum_{i=1}^{k} y_i, \quad 1 \leq k \leq n.$

Fact 1 If $H$ is an $n \times n$ Hermitian matrix with diagonal elements $h_1, \ldots, h_n$, and eigenvalues $\lambda_1, \ldots, \lambda_n$, then $h \prec \lambda$ on $R^n$.
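Definition 3.1 and Fact 1 can be illustrated with a short numerical check. The sketch below uses hypothetical helper names and restricts Fact 1 to a real symmetric $2 \times 2$ matrix, whose eigenvalues have a closed form, so that no linear algebra library is needed.

```python
import math


def majorizes(y, x, tol=1e-9):
    """Return True if y majorizes x in the sense of Definition 3.1.

    Both sequences are sorted in decreasing order; every partial sum of
    x must not exceed that of y, and the full sums must agree.
    """
    if len(x) != len(y):
        return False
    xs = sorted(x, reverse=True)
    ys = sorted(y, reverse=True)
    px = py = 0.0
    for xi, yi in zip(xs, ys):
        px += xi
        py += yi
        if px > py + tol:
            return False
    return abs(px - py) <= tol


def eigenvalues_sym_2x2(a, b, d):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return [mean + radius, mean - radius]
```

For the matrix [[3, 1], [1, 1]] the diagonal is (3, 1) and the eigenvalues are $2 \pm \sqrt{2}$, so `majorizes(eigenvalues_sym_2x2(3, 1, 1), [3, 1])` holds while the reverse relation does not.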

Definition 3.2 A real valued function $\phi(z) = \phi(z_1, \ldots, z_n)$ defined on a set $A \subset R^n$ is said to be Schur concave on $A$ if

$x \prec y$ on $A \implies \phi(x) \geq \phi(y).$

$\phi$ is strictly Schur concave on $A$ if strict inequality $\phi(x) > \phi(y)$ holds when $x$ is not a permutation of $y$.
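We will now state a theorem that results in a test for strict Schur concavity. We denote

$\phi_{(k)}(z) = \frac{\partial\phi(z)}{\partial z_k}, \quad \phi_{(i,j)}(z) = \frac{\partial^2\phi(z)}{\partial z_i\,\partial z_j}, \quad J_1(k,l) = \frac{\partial J_1}{\partial\sigma_{v_l}^2(k)}, \quad$ and $\quad J_1(k,l,m,n) = \frac{\partial^2 J_1}{\partial\sigma_{v_l}^2(k)\,\partial\sigma_{v_n}^2(m)}.$

Theorem 3.1 Let $\phi(z)$ be a scalar real valued function defined and continuous on $\mathcal{D} = \{(z_1, \ldots, z_n) : z_1 \geq \ldots \geq z_n\}$, and twice differentiable on the interior of $\mathcal{D}$. Then $\phi(z)$ is strictly Schur concave on $\mathcal{D}$ iff: (i) $\phi$ is symmetric in its arguments, (ii) $\phi_{(k)}(z)$ is increasing in $k$, and (iii) $\phi_{(k)}(z) = \phi_{(k+1)}(z) \implies \phi_{(k,k)}(z) - \phi_{(k,k+1)}(z) - \phi_{(k+1,k)}(z) + \phi_{(k+1,k+1)}(z) < 0.$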

It is known that $J_2$ is Schur concave, [7]. We also have the following lemma.
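Lemma 3.1 The real valued function $J_1$ as defined in (2.12) is strictly Schur concave under (2.13).

Proof: Clearly $J_1$ is symmetric in its arguments $\sigma_{v_l}^2(k)$, satisfying (i) of Theorem 3.1. Note that

$J_1(k,l) = \frac{1}{M}\,\frac{\left(\prod_{i=0}^{M-1}\sigma_{v_i}^2(k)\right)^{1/M}}{\sigma_{v_l}^2(k)}.$ (3.14)

If $\sigma_{v_{l_1}}^2(k_1) > \sigma_{v_{l_2}}^2(k_2)$, then under (2.13) $J_1(k_1,l_1) < J_1(k_2,l_2)$, satisfying condition (ii).

To establish (iii), note that

$J_1(k,l) = J_1(m,n) \iff \frac{\left(\prod_{i=0}^{M-1}\sigma_{v_i}^2(k)\right)^{1/M}}{\sigma_{v_l}^2(k)} = \frac{\left(\prod_{i=0}^{M-1}\sigma_{v_i}^2(m)\right)^{1/M}}{\sigma_{v_n}^2(m)},$ (3.15)

and that

$J_1(k,l,m,n) = \begin{cases} \dfrac{1-M}{M^2}\,\dfrac{\left(\prod_{i=0}^{M-1}\sigma_{v_i}^2(k)\right)^{1/M}}{\sigma_{v_l}^4(k)} & \text{if } k = m,\ l = n, \\[2ex] \dfrac{1}{M^2}\,\dfrac{\left(\prod_{i=0}^{M-1}\sigma_{v_i}^2(k)\right)^{1/M}}{\sigma_{v_l}^2(k)\,\sigma_{v_n}^2(k)} & \text{if } k = m,\ l \neq n, \\[2ex] 0 & \text{if } k \neq m, \end{cases}$

and hence, under (3.15),

$J_1(k,l,k,l) - J_1(k,l,m,n) - J_1(m,n,k,l) + J_1(m,n,m,n) < 0.$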

Note that

$S_v(\omega) = E(\omega)S_x(\omega)E^\dagger(\omega).$ (3.16)

Since $S_x(\omega)$ is Hermitian, we may write

$S_x(\omega) = U(\omega)\Lambda(\omega)U^\dagger(\omega),$ (3.17)

with $U(\omega)$ unitary and $\Lambda(\omega) = \mathrm{diag}\{\lambda_0(\omega), \lambda_1(\omega), \ldots, \lambda_{MN-1}(\omega)\}$, with $\lambda_i(\omega) \geq \lambda_{i+1}(\omega) \geq 0$ at all $\omega$. Then, [7], $\{\sigma_{v_l}^2(k)\}_{k=0,\,l=0}^{N-1,\,M-1} \prec \left\{\frac{1}{2\pi}\int \lambda_i(\omega)\,d\omega\right\}_{i=0}^{MN-1}$. We then have the following result.

Theorem 3.2 Consider Problem 2.1 and all quantities defined therein. Then optimality is attained iff $E(\omega) = PU^\dagger(\omega)$, where for (1.3), $P$ is any constant permutation matrix, and for (1.2), $P$ is any constant permutation matrix that leads to an optimum arrangement for $J_1$ in (2.12), when the subband variances are the normalized integrals of $\lambda_i(\omega)$. In this case, $S_v(\omega) = P\Lambda(\omega)P^T$.
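The decorrelating role of $E = PU^\dagger$ can be seen in the simplest possible setting. The sketch below assumes a frequency-independent (memoryless) real $2 \times 2$ PSD matrix, so that $U$ is a constant rotation, and takes $P = I$; these simplifications and the function names are assumptions made for illustration only. It shows that the transformed matrix is diagonal with the eigenvalues on the diagonal, and that the product of the subband variances drops to the determinant.

```python
import math


def pcfb_2x2(a, b, d):
    """Orthonormal E whose rows are eigenvectors of S = [[a, b], [b, d]].

    The first row belongs to the larger eigenvalue. This is the constant,
    real-valued special case of E = U^dagger with P = I.
    """
    theta = 0.5 * math.atan2(2.0 * b, a - d)
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]


def transform_cov(E, S):
    """Return E S E^T for real 2x2 matrices given as nested lists."""
    ES = [[sum(E[i][k] * S[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    return [[sum(ES[i][k] * E[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]
```

For `S = [[3, 1], [1, 1]]` the subband variances before the transform have product 3, while after the transform they are $2 \pm \sqrt{2}$ with product 2, the determinant of `S`.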

Figure  2  shows  the  coding  gains  under  the  two  schemes,  for  an  input  with  5-periodic  statistics.  As 
expected,  (1.3)  leads  to  higher  coding  gains.  Its  disadvantage  is  that  while  the  overall  bit  rate  averaged 
over  N  samples  is  the  same  as  in  (1.2),  it  could  lead  to  time  instants  in  which  the  bit  rate  is  significantly 
lower  than  the  target.  In  certain  time  critical  applications  this  may  not  be  desirable. 
Figure 2: Coding gain plots (coding gain vs. number of channels; subband coding of a WSCS signal with period 5).

4  Conclusions 

We have derived conditions for the optimal orthonormal subband coding of $N$-WSCS signals, using an $M$-channel uniform filter bank as subband coder with $N$-periodic filters and two periodic bit allocation schemes. As with the results of [6], where a static bit allocation scheme was considered, the optimum filter bank in each case is a PCFB.
ABSTRACT

We consider optimal orthonormal filter banks for subband coding of wide sense cyclostationary signals, with $N$-periodic second order statistics. An $L$-channel over decimated uniform filter bank, with $N$-periodic analysis and synthesis filters, is used as the subband coder. An average variance condition is used to measure the output distortion. We show that for at least three potential bit allocation strategies, the optimum filter bank is a principal component filter bank. This is in the same vein as our earlier results on subband coding with maximally decimated filter banks.

1.  INTRODUCTION 

Wide sense cyclostationary (WSCS) signals arise in many applications, [1], [2]. We consider optimum orthonormal subband coding of zero mean WSCS signals with $N$-periodic second order statistics, i.e. signals that obey for all $k, l$: $\mathcal{E}[x(k)x^*(l)] = \mathcal{E}[x(k+N)x^*(l+N)]$, where $\mathcal{E}[\cdot]$ denotes the expectation operator.

The subband coder itself is an $L$-channel over decimated uniform filter bank (UFB) (see fig. 1), with

$M > L,$

and $N$-periodic linear analysis and synthesis filters, $H_i(k,z)$ and $F_i(k,z)$, respectively. Each subband signal $v_i(k)$ is quantized at the $k$th instant by a $b_i(k)$ bit quantizer, $Q_i$. Subject to bit rate and orthonormality constraints, we wish to allocate bits $b_i(k)$ and select $H_i(k,z)$ and $F_i(k,z)$ to minimize the average variance of $x(k) - \hat{x}(k)$.

Among many possible bit rate constraints one can adopt, three are of interest here. The first, called static bit allocation (SBA), involves constant $b_i(k)$, and a bit rate constraint

$b = \left(\sum_{i=0}^{L-1} b_i\right)/L.$ (1.1)

Supported by ARO contract DAAD19-00-1-0534 and NSF grants ECS-9970105 and CCR-9973133.

The second and third both assume $N$-periodic bit allocation:

$b_i(k+N) = b_i(k).$ (1.2)

In the second, the average bit rate over all the channels is constant at each time instant, i.e. given $b$ and all $k$,

$b = \left(\sum_{i=0}^{L-1} b_i(k)\right)/L.$ (1.3)

The third assumes a fixed average bit rate over periods of length $N$:

$b = \frac{1}{LN}\sum_{k=0}^{N-1}\sum_{i=0}^{L-1} b_i(k).$ (1.4)

Among these, (1.2) requires the least computation and (1.4) is the most general. On the other hand, (1.3) is preferred over (1.4) in applications, such as control over networks, where the bit rate constraint must be enforced at every time instant.

Subband coding under these three constraints, with maximally decimated filter banks (i.e. $L = M$), has been studied in [6] and [7]. These references show that, while the optimum bit allocation schemes differ among (1.1)-(1.4), the optimizing $H_i(k,z)$ and $F_i(k,z)$ can be chosen to be the same regardless of the allocation scheme. In fact a Principal Component Filter Bank (PCFB) represents the common optimizing solution.

Recent studies, [4, 5], have established that the optimum UFB subband coder for Wide Sense Stationary (WSS) signals is a PCFB, [3]. The principal contribution of this paper is to show that even in the over decimated case, optimality is attained through PCFB's, despite the differing bit allocation constraints, reinforcing the universality of PCFB based solutions for problems such as these.

2.  OPTIMUM  BIT  ALLOCATION 

For any zero mean signal $x(k)$, define $\sigma_x^2(k) = \mathcal{E}[x^2(k)]$. All subband signals $v_i(k)$ have $N$-periodic second order statistics. As in [4], [5], we assume that the quantizers are modeled by additive zero mean noise sources, independent of the


with equality holding iff for each $i, l, k$

$2^{-2b_i(k)}\sigma_{v_i}^2(k) = 2^{-2b_l(k)}\sigma_{v_l}^2(k).$ (2.9)

Fig. 1. An $L$-channel over decimated filter bank as subband coder.

Likewise under (1.4), the lower bound becomes

$c\,2^{-2b}\left(\prod_{k=0}^{N-1}\prod_{l=0}^{L-1}\sigma_{v_l}^2(k)\right)^{1/LN},$ (2.10)

with the bound met iff for each $i, l, k_1, k_2$


$v_i(k)$, with variances of the form

$\sigma_{q_i}^2(k) = c\,2^{-2b_i(k)}\sigma_{v_i}^2(k),$ (2.5)

with $c$ a distribution dependent constant. Note that under (1.2), $\sigma_{q_i}^2(k)$ are $N$-periodic.

Observe that the overall filter bank is $MN$-periodic. Let $E(z)$ and $R(z)$ be the transfer functions of $MN$-fold blocked versions of the analysis and synthesis banks respectively. In particular, $E(z)$ is $LN \times MN$ and $R(z)$ is $MN \times LN$. A key difference between the over decimated and the maximally decimated cases is that these transfer functions are no longer square.

Define $x_i(k) = x(Mk - i)$, $\hat{x}_i(k) = \hat{x}(Mk - i)$ and the WSS vectors,

$\mathbf{x}(k) = [x_0(Nk), \ldots, x_0(Nk-N+1), \ldots, x_{M-1}(Nk), \ldots, x_{M-1}(Nk-N+1)]^T,$

$2^{-2b_i(k_1)}\sigma_{v_i}^2(k_1) = 2^{-2b_l(k_2)}\sigma_{v_l}^2(k_2).$ (2.11)

On the other hand under the static bit allocation strategy of (1.1), as shown in [6], the lower bound becomes

$\frac{c\,2^{-2b}}{N}\left(\prod_{l=0}^{L-1}\left(\sum_{k=0}^{N-1}\sigma_{v_l}^2(k)\right)\right)^{1/L},$ (2.12)

with the bound met iff for each $i, l$

$2^{-2b_i}\left(\sum_{k=0}^{N-1}\sigma_{v_i}^2(k)\right) = 2^{-2b_l}\left(\sum_{k=0}^{N-1}\sigma_{v_l}^2(k)\right).$ (2.13)

Observe, the optimum bit allocation scheme (2.11) is the most stringent among (2.9), (2.11) and (2.13).

Consequently  UFB  selection  reduces  to  the  following 
problem: 


$\mathbf{v}(k) = [v_0(Nk), \ldots, v_0(Nk-N+1), \ldots, v_{L-1}(Nk), \ldots, v_{L-1}(Nk-N+1)]^T,$ (2.6)

with power spectral density (PSD) matrices $S_x(\omega)$ and $S_v(\omega)$ respectively. Observe,

$\mathbf{v}(k) = E(z)\mathbf{x}(k).$

We assume $S_x(\omega)$ is known. We assume the perfect reconstruction and orthonormality conditions,

$E(z)E^\dagger(z) = I = R^\dagger(z)R(z)$, and $R(z) = E^\dagger(z).$ (2.7)


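Problem 2.1 Consider the $LN \times MN$ system $E(z)$ with WSS input vector $\mathbf{x}(k)$ with given Hermitian PSD matrix $S_x(\omega)$. Suppose $\mathbf{v}(k)$ in (2.6) is the output of $E(z)$. For (1.3) (resp. (1.4)), (resp. (1.1)) find $E(z)$ such that $J_1$ (resp. $J_2$) (resp. $J_3$) is minimized subject to (2.7).

$J_1 = \sum_{k=0}^{N-1}\left(\prod_{l=0}^{L-1}\sigma_{v_l}^2(k)\right)^{1/L}$ (2.14)

$J_2 = \left(\prod_{k=0}^{N-1}\prod_{l=0}^{L-1}\sigma_{v_l}^2(k)\right)^{1/LN}$ (2.15)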

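We propose to minimize the average variance of $q(k) = x(k) - \hat{x}(k)$ and, under (1.3) and (2.7), obtain

$\frac{1}{LN}\sum_{k=0}^{LN-1}\sigma_q^2(k) = \frac{1}{LN}\sum_{k=0}^{N-1}\sum_{l=0}^{L-1}\sigma_{q_l}^2(k) \geq \frac{c\,2^{-2b}}{N}\sum_{k=0}^{N-1}\left(\prod_{l=0}^{L-1}\sigma_{v_l}^2(k)\right)^{1/L}$ (2.8)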

$J_3 = \prod_{l=0}^{L-1}\left(\sum_{k=0}^{N-1}\sigma_{v_l}^2(k)\right)$ (2.16)

Observe all three of (2.14)-(2.16) are quite different from one another. While $J_2$ is similar to the corresponding cost function in the WSS case, [5], $J_1$ and $J_3$ are more complicated. Further, while $J_2$ does not change by permuting the subband variances, $J_1$ and $J_3$ do. Indeed given a set of subband variances at different time instants we need consider only the arrangements that lead to the minimum value of $J_1$, $J_3$. Such optimal arrangements are characterized below.


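Optimum Arrangement for $J_1$: Among the various permutations of $\sigma_{v_i}^2(j)$, ones that minimize $J_1$ obey, [8]:

$\sigma_{v_m}^2(k_1) > \sigma_{v_n}^2(k_2) \implies \prod_{i=0,\, i\neq m}^{L-1}\sigma_{v_i}^2(k_1) \leq \prod_{i=0,\, i\neq n}^{L-1}\sigma_{v_i}^2(k_2).$ (2.17)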

Further  we  note  the  following  result  from  [8]. 

Theorem 3.1 Let $\phi$ be a real-valued strictly Schur concave function defined and continuous on $\mathcal{D}$ as in Theorem 3.1. Then

$x \prec^w y \implies \phi(x) \geq \phi(y),$


For a 2-channel filter bank, $L = 2$, this requires that the largest be paired with the smallest, the second largest with the second smallest, etc.

Optimum Arrangement for $J_3$: Among the various permutations of $\sigma_{v_i}^2(j)$, ones that minimize $J_3$ obey, [6]: for each $l$, one partial sum equals the sum of the $N$ largest among the $\sigma_{v_i}^2(j)$, another equals the sum of the next $N$ largest, etc.


with  equality  holding  only  if  x  is  a  permutation  of  y. 


We  will  now  state  a  theorem  that  results  in  a  test  for  strict 
Schur  concavity.  We  denote 


<W“) 


dzk 


(Z) 


d2f(z) 
dzidzj  ’ 


3.  OPTIMUM  SUBBAND  CODER 

We  now  characterize  the  optimum  selection  of  E{z),  by 
introducing  the  notions  of  majorization  and  Schur  concavity, 
[8], 

Definition 3.1 Consider two sequences $x = \{x_i\}_{i=1}^{n}$ and $y = \{y_i\}_{i=1}^{n}$ with $x_i \geq x_{i+1}$ and $y_i \geq y_{i+1}$. Then we say that $y$ majorizes $x$, denoted as $x \prec y$, if the following holds with equality at $k = n$


and 


DJy 

dal(kY 


and  Jy(k,  Z,  m.  n ) 


d2Jy 

dalt{k)daln{m)' 


Theorem  3.2  Let  6(z)  be  a  scalar  real  valued  function  de¬ 
fined  and  continuous  on  V  =  {(zy, . . . ,  zn)  :  Zy  >  ...  > 
z„j,  and  twice  differentiable  on  the  interior  of  V.  Then  <f>(z) 
is  strictly  Schur  concave  on  V  iff:  (i)  z )  is  increasing 
in  k,  and  (ii) 


$\sum_{i=1}^{k} x_i \leq \sum_{i=1}^{k} y_i, \quad 1 \leq k \leq n.$


C(/.-)(~)  =  0(/.-ii(~)  <t>(k,k){z)  -  <t>(k,k+±){z) 

—  1  ./.■) (~)  + i /.—  i)(c)  <  0. 


Definition 3.2 Consider two sequences $x = \{x_i\}_{i=1}^{l}$ and $y = \{y_i\}_{i=1}^{l}$ with $x_i \geq x_{i+1}$ and $y_i \geq y_{i+1}$. Then we say that $y$ weakly supermajorizes $x$, denoted as $x \prec^w y$, if

$\sum_{i=k}^{l} x_i \geq \sum_{i=k}^{l} y_i, \quad 1 \leq k \leq l.$
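The tail-sum condition of Definition 3.2 is easy to test numerically. The helper below, whose name is hypothetical, sorts both sequences in decreasing order and compares the sums of their smallest entries; it is a direct transcription of the definition and makes no claim about how the relation is used elsewhere in the paper.

```python
def weakly_supermajorizes(y, x, tol=1e-9):
    """Return True if y weakly supermajorizes x (Definition 3.2).

    With both sequences sorted in decreasing order, every tail sum of x
    (the sum of its smallest entries) must be at least the matching
    tail sum of y. Unlike majorization, the totals need not be equal.
    """
    if len(x) != len(y):
        return False
    xs = sorted(x, reverse=True)
    ys = sorted(y, reverse=True)
    tail_x = tail_y = 0.0
    # Walk from the smallest entries upward, accumulating tail sums.
    for xi, yi in zip(reversed(xs), reversed(ys)):
        tail_x += xi
        tail_y += yi
        if tail_x < tail_y - tol:
            return False
    return True
```

As a simple case, keeping the two largest diagonal entries of diag(4, 2, 1) gives x = (4, 2), whose tail sums 2 and 6 dominate those of the two smallest eigenvalues y = (2, 1), namely 1 and 3, so `weakly_supermajorizes([2, 1], [4, 2])` is true even though the totals differ.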

We  also  have  the  following  Fact  from  [8], 

Fact  1  Consider  any  N M  x  NM  Hermitian  matrix  R  with 
eigenvalues  Ai  >  A2  >  ...  >  Xnm,  and  an  LAI  x  LM 
matrix  A  =  with  the  LAI  x  NM  matrix  HZ  obeying 

HZHZ1  =  I.  Then  the  diagonal  elements  Aiy  of  A  obey 

{^NM-LM-l:  ■  ■  ■  ,  AjVAf  }■  (3.18) 

Further  if  AI  =  N, 

-<  {Ai  j ....  Ajvm  }■  (3.19) 


If  only  (i)  holds  then  <j)(z)  is  only  Schur  concave. 

It  is  known,  that  J2  is  strictly  Schur  concave,  [8].  We 
also  have  the  following  lemma. 


Lemma  3.1  The  real  valued  function  Jy  as  defined  in  (2.14) 
is  strictly  Schur  concave  under  (2.17). 


Proof:  Clearly  Jy  is  symmetric  in  its  arguments  n(:i  i  k ) , 
satisfying  (i)  of  Theorem  3.2.  Note  that 


Ji(kJ) 


1  (uU<(k))  ' 

L  a2Vi(k) 


If  a'2^  ( ky )  >  a'2^  ( ky ),  then  under  (2.17) 


(3.20) 


Jy(ky,ly)  <  Jr  (&2,  Z2), 


Definition  3.3  A  real  valued  function  f(z)  =  <f>(zy, . . . ,  zn)  satisfying  condition  (ii). 
defined  on  a  set  A  C  Rn  is  said  to  be  Schur  concave  on  A  if  establish  ('")>  note  l^at 


x  -<  y  on  A  =>  4>(x)  >  4>{y). 


Jy(k,  l)  =  Jy(m,  n )  O 


,(*) 


(nr;<w) 

av  (m) 


1/L 


(j>  is  strictly  Schur  concave  on  A  if  strict  inequality  4>(x)  > 
4>(y)  holds  when  x  is  not  a  permutation  of  y. 


(3.21) 


if  k  =  m,l  =  ra, 


4.  CONCLUSIONS 


/,  m,  n)  =  < 


l/L 


K(*)j 

i 

£2 

o 


i/* 


if  &  =  m,l  ^  n, 
if  A:  ^  m,  1  =  n. 


and  hence,  under  (3.21) 


Ji(k,l,k,l)  —  Ji(k,l,m,n)  —  Ji(m,n,k,l) 

+  Ji(m,n,m,n)  <  0. 


Finally we note that under the pertinent optimum arrangement, $J_3$ is also Schur concave, but not in the strict sense. This follows from a slight variation of the fact that $J_2$ is strictly Schur concave, see also [6]. Note that

$S_v(\omega) = E(\omega)S_x(\omega)E^\dagger(\omega).$ (3.22)


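Now suppose the NM eigenvalues of $S_x(\omega)$ are

$$\{\lambda_0(\omega), \lambda_1(\omega), \ldots, \lambda_{MN-1}(\omega)\},$$

with $\lambda_i(\omega) \ge \lambda_{i+1}(\omega) \ge 0$ at all $\omega$. Define $U(\omega)$ as the $NM \times LM$
matrix whose columns are the unit eigenvectors corresponding
to the smallest LN eigenvalues of $S_x(\omega)$. Observe

$$U^\dagger(\omega) U(\omega) = I.$$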


Then,  because  of  Fact  1 


N—1,L  —  1 

k=0ji=0 


-< 


W 


if 


A  i(w)dLj} 


MN-l 
i—MN  —  LNm 


Note the number of diagonal elements of $S_v(\omega)$ is less than
the number of eigenvalues of $S_x(\omega)$, as the overdecimated
condition forces $E(z)$ to be rectangular. Consequently, unlike
[6] and [7], where maximal decimation forced a square
$E(z)$, weak supermajorization, rather than majorization, must
be used.

We  then  have  the  following  result. 


Theorem 3.3 Consider Problem 2.1 and all quantities defined
therein. Then optimality is attained if for a suitable frequency
invariant permutation matrix $P$, $E(\omega) = P U^\dagger(\omega)$.

We note that for $J_2$ this solution is unique to within an
arbitrary permutation matrix $P$. For $J_1$ too this solution is
unique to within any permutation matrix $P$ that enforces an
optimum arrangement. This is so because both $J_1$ and $J_2$ are
strictly Schur concave. For $J_3$, on the other hand, even though
$P$ must enforce an optimum arrangement, the solution is
by no means unique, as $J_3$ is not strictly Schur concave.
Nonetheless it is intriguing that despite the difference between
the $J_i$, a common $E$ optimizes all three.


We have derived conditions for the optimal orthonormal subband
coding of N-WSCS signals, using an overdecimated
L-channel uniform filter bank as subband coder with N-periodic
filters, under three bit allocation schemes. As with the results
of [6], [7], an optimum filter bank in each case is the
same PCFB.
OPTIMALITY IN MULTICARRIER COMMUNICATION, MULTIPLE DESCRIPTION
CODING AND THE SUBBAND CODING OF CYCLOSTATIONARY SIGNALS

ABSTRACT

This paper considers three different problems in signal
processing/communications. The first involves Multiuser Discrete
Multitone Transmission (DMT). The other two problems
concern variants of subband coding, specifically subband
coding of cyclostationary signals and multiple description
coding. We show that underlying each is the same
optimization problem, which can be solved using the theory
of majorization.

1.  INTRODUCTION 

This paper considers three different problems in signal
processing/communications. The first involves Multiuser Discrete
Multitone Transmission (DMT), [1], also variously
known as multicarrier modulation or Orthogonal Frequency
Division Multiplexed (OFDM) communication. The other
two problems concern variants of subband coding, specifically
subband coding of cyclostationary signals and multiple
description coding. The paper demonstrates that despite
the different antecedents of these three applications,
underlying each is the same optimization problem. We show
how this problem can be solved through the use of the theory
of majorization, [8]. We begin by motivating the three
problems in question.

1.1.  Multiuser  DMT: 

DMT  has  been  adopted  as  the  signaling  standard  in  Asym¬ 
metric  Digital  Subscriber  Lines  (ADSL),  [12]  and  has  been 
proposed  as  the  modulation  scheme  of  choice  in  the  Mill 
Bahama  and  Magic  Wand  wireless  ATM  systems,  [13].  In¬ 
deed  in  advocating  DMT  over  CDMA,  the  following  point 
is  made  in  [14]:  “A  spreading  factor  of  85  (13  kb/s  voice)  or 
128  (8kb/s  voice)  (for  CDMA)  is  used  with  IS-95  to  provide 
about  20  dB  of  processing  gain.  At  much  higher  bit  rates, 
CDMA  systems  must  either  reduce  the  processing  gain  or 
expand  the  bandwidth,  but  neither  may  be  an  attractive 
alternative.” 

We  consider  DMT  in  a  multiuser  environment.  Thus  the 
DMT  system  studied  here  supports  multiple  users,  with 
varying  quality  of  service  (QoS)  requirements,  quantified 
by  their  respective  bit  rate  and  symbol  error  rate  (SER) 
specifications. Specifically, consider the DMT system as in
fig. 1, which depicts an M-subchannel filter bank model of
a DMT system. We consider an overinterpolated (N > M)
filter bank as the transceiver. We assume that the channel
C(z) is FIR of length k (preequalization is assumed
to have been done), and v(n) is additive colored noise with
known spectrum. Thus for example v(n) could represent
cochannel interference. Note, [10] provides models for cochannel
interference in a variety of settings. To mitigate intersymbol
interference (ISI), a form of redundancy is incorporated
by choosing N = M + k. The transmitting and receiving filters,
$F_k(z) = \sum_{i=0}^{N-1} z^{-i} G_{i,k}$ and $H_k(z) = \sum_{i=0}^{N-1} z^{i} S_{k,i}$, act as
modulating and demodulating transforms respectively. In a
DFT based DMT implementation [1], the IDFT and DFT
are used as the modulating and demodulating transforms
respectively.

In this paper, as in [5], we will consider more general
transformations leading to a generalized DMT system. To
capture a multiuser environment, we assume that there
are r users, each having been assigned L subchannels, i.e.
M = Lr. Further, the k-th user requires a bit rate of $t_k$
and an SER of no more than $\eta_k$. Our goal is to select
$F_i$ and $H_i$, and distribute the bit rates among the various
subchannels, to achieve the above specifications with the
minimum possible transmitted power, given the knowledge
of C(z) and the spectrum of v(n). The problem addressed
here thus directly generalizes that in [5], which addresses
the same power minimization issue but assumes a single
user subject to only one bit rate and SER constraint. As will
be shown, the multiuser setting renders the optimization
problem highly nontrivial in comparison to the single user
case.


Figure 1. Filter bank based DMT model.


1.2.  Multiple  Description  Coding  (MDC) 

Multiple description coding is a variant of subband coding
in which different parts of the signal to be coded have
different bit budget requirements. Specifically, in fig. 2 assume
M = Lr and that the $Q_i$ are $b_i$-bit quantizers subject
to r separate average bit rate constraints. The goal
in MDC is to have essentially r redundant coders as insurance
against failure of one or more. Thus, channels
$\{L(j-1)+1, \ldots, L(j-1)+L-1\}$ represent the j-th coder.
Past optimization work related to MDC has focussed on the
two coder case, with optimization directed at minimizing
the average distortion subject to the failure of one coder,
[15]-[16]. By contrast our goal is to optimize in the failure
free case. Specifically, we select the LTI filters $F_i$, $H_i$,
and allocate bits among the $Q_i$, subject to the bit rate
constraints specified by the problem, to minimize the average
output distortion variance. The optimization occurs under
an orthonormality condition, specifically that the arrangements
to the left and right of the quantizers are all pass.


Figure  2.  A  Maximally  Decimated  Uniform  Filter 
Bank 


to be N-periodic, that is

$$b_i(k + N) = b_i(k). \qquad (1..1)$$

Our goal is to select $b_i(k)$ and the filters $H_i(k,z)$ and
$F_i(k,z)$ so that the average variance of $\hat{x}(k) - x(k)$ is
minimum, subject to orthonormality and the constraint that
the average bit rate at each time instant is constant.

1.4.  Outline 

In Section 2. we expose the commonality of the underlying
optimization problems. Section 3. reviews the theory of
majorization and explains its applicability to the solutions
we seek. Section 4. describes the optima. Section 5. is the
Conclusion.

2.  FORMULATION 

Underlying each of the three problems are two essential
tasks: Optimum Bit Allocation (OBA), which, given a
selection of the filters, distributes the bits among the various
subchannels subject to the bit rate constraints; and Filter
Selection, which involves selecting the filters in an optimal
way. In this section we will focus on developing the objective
functions and demonstrating that they reduce to the
same form under OBA.

2.1.  DMT 

Generally,  to  preserve  orthogonality  G  —  [Gij]ffj=1  is  uni¬ 
tary  i.e. 

$$G G^H = I. \qquad (2..2)$$

One can show, [5], under mild assumptions on C(z), that
given any G as in (2..2), $H_i(z)$ can be found to render the
Perfect Reconstruction (PR) condition:


1.3.  Subband  Coding  of  Cyclostationary  Signals 

We consider optimum orthonormal subband coding of zero
mean wide sense cyclostationary (WSCS) signals. A signal
x(k) is WSCS with period N if for all k, l:

$$E[x(k)x^*(l)] = E[x(k+N)x^*(l+N)],$$

where $E[\cdot]$ denotes the expectation operator. A wide variety
of manmade signals encountered in communication, telemetry,
radar and sonar systems, as well as several generated
by nature [6], are WSCS. Examples of manmade signals
exhibiting cyclostationarity include signals found in amplitude,
phase and frequency modulation systems, periodic
keying of amplitude, phase and frequency in digital modulation
systems, and periodic scanning in television, facsimile
and some radar systems, [6]. Further, [7] demonstrates that
WSCS models provide more accurate descriptions of speech
signals than do traditional WSS models.
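The WSCS condition can be made concrete with a small sketch. The model below is an assumption chosen purely for illustration (it is not taken from the paper): a zero mean WSS process with autocorrelation `rho**|k-l|` is multiplied by an N-periodic gain sequence, which gives a closed-form autocorrelation that can be tested for N-periodicity. The function names `wscs_autocorr` and `is_wscs` are hypothetical.

```python
import numpy as np

def wscs_autocorr(k, l, gains, rho=0.8):
    """E[x(k) x*(l)] for x(k) = gains[k mod N] * w(k), where w is a zero mean
    WSS process with autocorrelation rho**|k - l| (an assumed toy model)."""
    N = len(gains)
    return gains[k % N] * np.conj(gains[l % N]) * rho ** abs(k - l)

def is_wscs(autocorr, N, span=12):
    """Check E[x(k)x*(l)] == E[x(k+N)x*(l+N)] on a finite grid of (k, l)."""
    return all(
        np.isclose(autocorr(k, l), autocorr(k + N, l + N))
        for k in range(-span, span)
        for l in range(-span, span)
    )
```

With `gains = np.array([1.0, 2.0, 0.5])`, the check passes for period 3 (and its multiples) and fails for period 1, i.e. the toy signal is cyclostationary but not stationary.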

We assume that the filter bank is orthonormal. That is,
for all square summable inputs x(k), the combined energy
of the M subband signals $v_i(k)$ equals the energy in x(k),
and in the absence of the quantizers the filter bank output
$\hat{x}(k)$ matches x(k) for all x(k). It is easy to show that
under these conditions the subband signals are themselves
WSCS with period N. We adopt a Periodically Dynamic
Bit Allocation (PDBA) scheme where we choose each $b_i(k)$


$$\hat{x}_i(n) = x_i(n), \qquad \forall i \in \{0, \ldots, M-1\}.$$


Let the input power in the j-th subband of the k-th user be
$\sigma^2_{x_{j,k}}$. Due to PR, this is also the output signal power $\sigma^2_{\hat{x}_{j,k}}$
in the j-th subband of the k-th user. Let the output noise
power in this subband be $\sigma^2_{e_{j,k}}$, and $b_{j,k}$ be the number of
bits allocated in this subchannel. Due to different QoS requirements,
we may have different bit rate constraints for
the users. The average number of bits for the k-th user is
$b_k = \frac{1}{L} \sum_{j=0}^{L-1} b_{j,k}$. However we need to account for the
reduction in bit rate due to the zero padding. The average bit
budget for the k-th user is then $t_k = \frac{L}{N} b_k$.

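With a high bit rate assumption made on the modulation
system, we have, [5], for the k-th user

$$\sigma^2_{x_{j,k}} = c_k 2^{2 b_{j,k}} \sigma^2_{e_{j,k}},$$

where the constant $c_k$ depends on the SER $\eta_k$. We seek to
minimize the average transmission power given by

$$P = \frac{1}{M} \sum_{k=1}^{r} \sum_{j=0}^{L-1} \sigma^2_{x_{j,k}} \qquad (2..3)$$
$$= \frac{1}{M} \sum_{k=1}^{r} \sum_{j=0}^{L-1} c_k 2^{2 b_{j,k}} \sigma^2_{e_{j,k}} \qquad (2..4)$$

subject to the bit rate budgets

$$t_k = \frac{1}{N} \sum_{j=0}^{L-1} b_{j,k}, \qquad k = 1, \ldots, r, \qquad (2..5)$$

and the PR requirement. Now apply the AM-GM inequality,
which states that the arithmetic mean is never less than
the geometric mean, with equality iff the numbers whose
means they represent are identical. Thus,

$$P = \frac{1}{M} \sum_{k=1}^{r} \sum_{j=0}^{L-1} c_k 2^{2 b_{j,k}} \sigma^2_{e_{j,k}} \qquad (2..6)$$
$$\ge \frac{L}{M} \sum_{k=1}^{r} c_k \left( \prod_{j=0}^{L-1} 2^{2 b_{j,k}} \sigma^2_{e_{j,k}} \right)^{1/L} \qquad (2..7)$$
$$= \frac{1}{r} \sum_{k=1}^{r} c_k \left( 2^{2 N t_k} \prod_{j=0}^{L-1} \sigma^2_{e_{j,k}} \right)^{1/L}, \qquad (2..8)$$

with equality holding iff for all i, k:

$$c_k 2^{2 b_{i,k}} \sigma^2_{e_{i,k}} = c_k \left( 2^{2 N t_k} \prod_{j=0}^{L-1} \sigma^2_{e_{j,k}} \right)^{1/L}. \qquad (2..9)$$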
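The AM-GM step can be illustrated for a single user. The sketch below is a simplified illustration, not the paper's procedure: it drops the constant factor $c_k$, treats bits as real numbers (a high-rate idealization), and uses the hypothetical names `optimal_bits` and `power`. It computes the allocation that equalizes $2^{2b_j}\sigma_j^2$ across subchannels and evaluates the resulting power.

```python
import numpy as np

def optimal_bits(noise_var, total_bits):
    """Real-valued bits b_j minimizing sum_j 2**(2*b_j) * noise_var[j]
    subject to sum_j b_j = total_bits (AM-GM equality condition)."""
    noise_var = np.asarray(noise_var, dtype=float)
    L = noise_var.size
    log_gm = np.mean(np.log2(noise_var))   # log2 of the geometric mean
    # Equalize 2**(2*b_j) * noise_var[j] across all subchannels j.
    return total_bits / L + 0.5 * (log_gm - np.log2(noise_var))

def power(bits, noise_var):
    """Objective sum_j 2**(2*b_j) * noise_var[j] for a given allocation."""
    bits = np.asarray(bits, dtype=float)
    noise_var = np.asarray(noise_var, dtype=float)
    return float(np.sum(2.0 ** (2.0 * bits) * noise_var))
```

At the returned allocation the power equals $L \, 2^{2B/L}$ times the geometric mean of the noise variances, where $B$ is `total_bits`; any other allocation with the same total should give a power at least as large. Note that very noisy subchannels can receive negative "bits" in this idealized form.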


where $q_i(k)$ is zero mean, white, independent from $v_i(k)$
and has variance

$$\sigma^2_{q_i} = c 2^{-2 b_i} \sigma^2_{v_i}. \qquad (2..14)$$

The  average  output  distortion  is  then  given  by: 

z^EXEW  (2-15) 

/c— 0  j=0 

Because  of  (2..  14)  under  optimum  bit  allocation  one  can 
show  that  the  optimization  problem  reduces  to  finding  an  all 
pass  M  X  M,  E(z),  so  that  (2.. 10)  is  minimized  with  =  1 
and  dj^k  the  variance  of  VjL+i.  Notice  in  this  case  one  must 
find  the  all  pass  operator  E(z)  and  that  the  variance  of 
Vj L+i  are  the  diagonal  elements  of  the  matrix 

(2. .16) 

and  that 

$$S_v(\omega) = E(e^{j\omega}) S_x(\omega) [E(e^{j\omega})]^\dagger \qquad (2..17)$$

where $S_x(\omega)$ is the known Power Spectral Density (PSD)
matrix of the vector

$$[x(k), x(k-1), \ldots, x(k - Lr + 1)]^T.$$


This is the optimum bit allocation strategy. The optimal
transceiver design is to find unitary matrix G so as to minimize

$$J = \sum_{k=1}^{r} \alpha_k \left( \prod_{j=0}^{L-1} \sigma^2_{j,k} \right)^{1/L}. \qquad (2..10)$$


One  can  show  that  the  quantities  cr\.  k  are  the  diagonal 
elements  of  Re  given  by 


Rc  =  GoRcGo,  (2.. 12) 


Of course the all pass constraint reduces to

$$E(e^{j\omega}) [E(e^{j\omega})]^\dagger = I. \qquad (2..18)$$

2.3.  WSCS 

Suppose now that x(k) in fig. 2 has N-periodic second order
statistics. Then the goal is to select N-periodic $H_i$ and
$F_i$, and a bit allocation, to minimize the distortion in $\hat{x}(k)$,
subject to PR and the condition that E(z), the $NM \times NM$,
NM-fold lifted version of the arrangement to the left of the
quantizers, is all pass. In this case we select the $b_i$ to be
N-periodic and subject to the Periodically Dynamic Bit
Allocation, with $b_i(k + N) = b_i(k)$, i.e.


where  Re  is  a  known  matrix  obtained  from  the  statistics  of 
v(n). 

2.2.  MDC 

In  this  case  the  bit  budget  constraint  is: 

L  —  l 

XE'  •'  11  ’  V,  e  -  1}.  (2..13) 

i—0 

One must select the LTI filters $F_i$, $H_i$ to be such that in
the absence of quantizers $\hat{x}(k) = x(k)$, i.e. the filter bank is
PR. In addition one imposes the requirement that E(z), the
$M \times M$, M-fold lifted equivalent of the arrangement to the
left of the quantizers, is all pass. Then the goal is to select
all pass E(z) and allocate bits among the subbands, subject
to (2..13) and PR, so that the average quantizer induced
mean-square distortion in $\hat{x}(k)$ is minimized. Under high
bit rates the quantizer noise model is [11],

$$w_i(k) = v_i(k) + q_i(k)$$


M-l 

i= 0 

Clearly, the subband signals $v_i(k)$ are themselves WSCS
with period N, that is

$$\sigma^2_{v_i}(k) = \sigma^2_{v_i}(k + N).$$

We will assume that the quantizers are modeled by additive
zero mean noise sources, independent of the $v_i(k)$, with
variances of the form

$$\sigma^2_{q_i}(k) = c 2^{-2 b_i(k)} \sigma^2_{v_i}(k). \qquad (2..19)$$

Note that under (1..1), $\sigma^2_{q_i}(k)$ are N-periodic. Then the
distortion $q(k) = \hat{x}(k) - x(k)$ is WSCS with period MN,
[17]. We propose to minimize the average variance of q(k),

$$\frac{1}{MN} \sum_{k=0}^{MN-1} \sigma^2_q(k) = \frac{1}{MN} \sum_{k=0}^{N-1} \sum_{i=0}^{M-1} \sigma^2_{q_i}(k). \qquad (2..20)$$


Under optimum bit allocation one can show that one
must now find all pass E(z) so that the following is minimized:

$$J_{sbc} = \sum_{j=0}^{N-1} \left( \prod_{i=0}^{M-1} \sigma^2_{v_i}(j) \right)^{1/M}. \qquad (2..21)$$

Note the similarity to (2..10). Again $\sigma^2_{v_i}(j)$ are the diagonal
elements of a matrix as in (2..16), with (2..17) and
(2..18) in force. Now however, $S_x(\omega)$ and $S_v(\omega)$ are respectively
the PSDs of

$$x(k) = [x_0(Nk), \ldots, x_0(Nk - N + 1), \ldots, x_{M-1}(Nk), \ldots, x_{M-1}(Nk - N + 1)],$$

$$v(k) = [v_0(Nk), \ldots, v_0(Nk - N + 1), \ldots, v_{M-1}(Nk), \ldots, v_{M-1}(Nk - N + 1)].$$

3.  MAJORIZATION 

We  define  majorization  and  Schur  concavity  [8]. 

Definition 3..1 Consider two sequences $x = \{x_i\}_{i=1}^{n}$ and
$y = \{y_i\}_{i=1}^{n}$ with $x_i \ge x_{i+1}$ and $y_i \ge y_{i+1}$. Then we say
that y majorizes x, denoted as $x \prec y$, if the following holds
with equality at k = n:

$$\sum_{i=1}^{k} x_i \le \sum_{i=1}^{k} y_i, \qquad 1 \le k \le n.$$

Definition 3..2 A real valued function $\phi(z) = \phi(z_1, \ldots, z_n)$
defined on a set $A \subset R^n$ is said to be Schur concave on A if

$$x \prec y \text{ on } A \Rightarrow \phi(x) \ge \phi(y).$$
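The two definitions can be exercised with a short sketch. The helper names `majorizes` and `geometric_mean` are hypothetical, and the geometric mean is used here only as a familiar example of a Schur concave function on positive vectors; it is not claimed to be the objective used in the paper.

```python
import numpy as np

def majorizes(y, x, tol=1e-9):
    """True if y majorizes x: partial sums of the decreasingly sorted x
    never exceed those of y, and the total sums agree."""
    xs = np.sort(np.asarray(x, dtype=float))[::-1]
    ys = np.sort(np.asarray(y, dtype=float))[::-1]
    if xs.shape != ys.shape:
        return False
    cx, cy = np.cumsum(xs), np.cumsum(ys)
    return bool(np.all(cx <= cy + tol) and abs(cx[-1] - cy[-1]) <= tol)

def geometric_mean(z):
    """Geometric mean of a vector with positive entries."""
    return float(np.exp(np.mean(np.log(np.asarray(z, dtype=float)))))
```

For instance, `[3, 2, 1]` majorizes `[2, 2, 2]`, and the geometric mean of the more evenly spread vector is the larger of the two, as Schur concavity requires.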


and 

L  —  l  L  —  l 

OLm  ^  & j,m  ^  ^j,n-  (3.. 23) 

j=0  j=0 

Call J under such an optimum arrangement $J^*$. Then one
can show from Theorem 3..1 that:

Theorem 3..3 The real valued scalar function J as defined
in (2..10) under the optimality conditions (3..22-3..23) is
strictly Schur concave.

4.  THE  SOLUTION 

All three problems have remarkably similar structure. Using
the theory of majorization, and in particular Theorems
3..2 and 3..3, one can show the following.

• For DMT, the optimizing G is, to within a permutation
matrix, the Karhunen-Loève Transform matrix of
the autocorrelation matrix of the M-fold lifted version
of $v_i(n)$.

•  For  Multiple  Description  Coding  the  solution  is  as  fol¬ 
lows.  Suppose  is  the  Power  Spectral  Density 

(PSD)  matrix  of  the  vector  of  V{(n)  in  fig.  2.  Sup¬ 
pose  A(lo)  is  the  diagonal  matrix  of  the  eigenvalues  of 

A*(lj)  ,  with  A i(oj)  >  Ai_|_i(cj).  Define  Q(u)  to 
be  the  matrix  that  is  unitary  at  all  uj  and  in  addition 
forces 

(lj)  =  A(u). 

Then  for  a  constant  permutation  matrix,  P,  E(eJUJ)  = 
PQ(oj). 

•  The  solution  for  the  WSCS  problem  is  trivially  similar. 

Here the frequency invariant permutation matrix is used
to achieve the optimum arrangement exemplified in (3..22-3..23).


We will now state a theorem that results in a test for
strict Schur concavity. We denote

$$\phi_{(k)}(z) = \frac{\partial \phi(z)}{\partial z_k} \quad \text{and} \quad \phi_{(i,j)}(z) = \frac{\partial^2 \phi(z)}{\partial z_i \partial z_j}.$$

Theorem 3..1 Let $\phi(z)$ be a scalar real valued function defined
and continuous on D, and twice differentiable on the
interior of D. Then $\phi(z)$ is strictly Schur concave on D
iff (i) $\phi$ is symmetric in its arguments, (ii) $\phi_{(k)}(z)$ is increasing
in k, and (iii) $\phi_{(k)}(z) = \phi_{(k+1)}(z) \Rightarrow \phi_{(k,k)}(z) -
\phi_{(k,k+1)}(z) - \phi_{(k+1,k)}(z) + \phi_{(k+1,k+1)}(z) < 0$.

Theorem 3..2 If H is an $n \times n$ Hermitian matrix with diagonal
elements $h_1, \ldots, h_n$ and eigenvalues $\lambda_1, \ldots, \lambda_n$, then
$h \prec \lambda$ on $R^n$.
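Theorem 3..2 is easy to check numerically. The following sketch uses a hypothetical helper `diag_and_eigs` and a randomly generated Hermitian matrix (an assumption made for illustration only); it returns the diagonal and the eigenvalues, both sorted in decreasing order, so that the partial sums of Definition 3..1 can be compared directly.

```python
import numpy as np

def diag_and_eigs(n=5, seed=0):
    """Sorted (decreasing) diagonal entries and eigenvalues of a random
    n x n Hermitian positive semidefinite matrix."""
    rng = np.random.default_rng(seed)
    B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    H = B @ B.conj().T                         # Hermitian by construction
    h = np.sort(np.real(np.diag(H)))[::-1]     # diagonal, decreasing
    lam = np.sort(np.linalg.eigvalsh(H))[::-1] # eigenvalues, decreasing
    return h, lam
```

Comparing `np.cumsum(h)` with `np.cumsum(lam)` should show every partial sum of the diagonal bounded by that of the eigenvalues, with equal totals (both equal the trace). A unitary change of basis that diagonalizes `H` therefore produces the most "spread out" diagonal possible, which is why Schur concave objectives are minimized there.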


Now turn to (2..10), the common objective function for
all three problems. Suppose, given $\{\sigma^2_{j,k}\}$, one were to seek
the rearrangement of these to achieve the minimum value
possible. Then it follows from [18] that the optimum
arrangement must obey the following property:


L  —  l  L  —  l 

m,ki  ^  @,n,k2  &ki  —  ^ky,  J.  @'j,k2 

j^m  j^n 


(3.. 22) 


5.  CONCLUSION 

We have shown that three problems in signal processing
and communications, with differing motivations and genesis,
have similar solutions. All three benefit from the powerful
and elegant theory of majorization. Given that both the MDC
and WSCS problems relate to subband coding, the similarity in
their solutions is not a surprise. That they are also equivalent
to the optimal DMT problem can be attributed to the
fact that the optimum DMT can be interpreted as a dual
problem to subband coding.
ABSTRACT

In this paper, we develop an optimum compaction filter
based design of filter banks for the subband coding of wide-sense
cyclostationary (WSCS) signals. The design of the
optimal orthonormal filter bank is specified in terms of optimal
compaction filters. Each filter is designed using only
a priori knowledge of the cyclic autocorrelation of the input
WSCS signal. This design theory is developed by first providing
further insight into the optimal compaction filter design
in the case of wide-sense stationary (WSS) signals.

1.  INTRODUCTION 

The  energy  compaction  concept  plays  an  important  role  in 
subband  coding  theory,  and  energy  compaction  filters  find 
applications  in  the  design  of  orthonormal  subband  coders. 
Optimality  of  filter  banks  is  in  the  sense  of  maximizing  cod¬ 
ing  gain,  which  is  a  measure  of  the  distortion  due  to  subband 
quantization.  The  optimal  orthonormal  subband  coding  of 
WSS  and  WSCS  signals  has  been  treated  in  [10],  [11]  and 
[1],  [6]  respectively.  It  has  been  shown  [8],  [11]  that  the  op¬ 
timal  orthonormal  filter  bank  in  the  WSS  case  can  be  con¬ 
structed  by  designing  the  analysis  filters  one  at  a  time  by 
choosing  them  to  be  optimal  compaction  filters  for  appro¬ 
priate  power  spectral  densities  (psds)  derived  from  the  in¬ 
put  signal  psd.  Compaction  filters  thus  are  of  interest  due 
to  their  connection  with  optimal  subband  coding  and  prin¬ 
cipal  component  filter  banks.  Compaction  filters  have  been 
treated in some detail for the WSS case in [8], [9], [11]. The
compaction filter was formulated as an eigen problem in [8]
and given a principal component approach. In [9], compaction
filters were derived by an energy analysis. Properties
of compaction filters have been further studied in [11].

In this paper, we develop the compaction filter concept
in the context of WSCS signals. Cyclostationarity is exhibited
by some parameters for most manmade signals encountered
in communication, telemetry, radar, and sonar systems, [2].
Considering these underlying periodicities and
modeling the random signals as cyclostationary can lead to
improvements in performance of signal processors.

This work was supported by US Army contract DAAD19-00-1-0534,
and NSF grants ECS-9970105 and CCR-9973133.

A signal x(k) is WSCS with period M if for all k, l:

$$E[x(k)x^*(l)] = E[x(k + M)x^*(l + M)],$$

where $E[\cdot]$ denotes the expectation operator. The subband
coder considered is an N-channel uniform maximally decimated
filter bank depicted in fig. 1. We will consider a 2-channel
(N = 2) subband coder in this paper to illustrate the
ideas. Here x(k) is WSCS with period M and at time k, $Q_i$
is a $b_i(k)$-bit quantizer. When x(k) is Wide Sense Stationary
(WSS) one selects the analysis filters $H_i(k,z)$ and the
synthesis filters $F_i(k,z)$ to be linear time invariant (LTI).
Since x(k) here is WSCS with period M, we assume that
these filters are Linear Periodically Time Varying (LPTV)
with period M. A linear filter with impulse response h(k,l)
is called M-periodic, if for all k, l:

$$h(k,l) = h(k + M, l + M). \qquad (1.1)$$

The time index k in $H_i(k,z)$ and $F_i(k,z)$ recognizes their
lack of time invariance.

Given the autocorrelation of the WSCS signal x(k), the
psd matrix of the M-blocked version of x(k), which is WSS,
can be found and will be assumed to be known. We are thus
interested in designing the filters $H_i(k,z)$, $F_i(k,z)$ by an
energy compaction approach. We shall suitably extend the
idea of an optimum compaction filter for WSCS signals in
this paper. We show that the optimal orthonormal filter bank
for a WSCS input can be designed by choosing each analysis
filter to be an optimum compaction filter for an appropriately
"peeled-off" signal, with the peeled-off signal derived
from the input signal. We first provide useful insight into the
compaction filter design for WSS signals. This analysis is
then paralleled for the case when the input to the filter bank
is WSCS.

In Section 2, we recapitulate the optimum compaction
filter design for WSS signals. The results essentially provide
a different interpretation to the compaction process described
in [11]. Section 3 then introduces the notion of an
optimum compaction filter as applicable to WSCS inputs,
and then describes the energy compaction approach to subband
coder design when the input signals are WSCS. Section
4 sums the contributions of this paper.

Fig. 1. An N-channel filter bank as subband coder

Notation: For compactness, we use $[x(0 : 1 : N-1)]$
to denote the vector

$$[x(0), x(1), \ldots, x(N-1)],$$

and $[x(0 : -1 : N-1)]$ to denote the vector

$$[x(0), x(-1), \ldots, x(-N+1)].$$

We use $[X(z)]^\dagger$ to denote the transposed conjugate of the
matrix $[X(z^{-1})]$.

2.  OPTIMUM  COMPACTION  FILTERS  FOR  WSS 
SIGNALS 

Consider fig. 2. The filter H(z) is said to be an optimum
compaction filter for the zero-mean WSS input signal x(k)
if it maximizes the output variance $\sigma^2$ subject to the constraint
that $|H(\omega)|^2$ is Nyquist-N, that is,

$$\sum_{k=0}^{N-1} \left| H\left(\omega - \frac{2\pi k}{N}\right) \right|^2 = N, \qquad \forall \omega.$$
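The Nyquist-N constraint can be verified numerically for a given FIR filter. The sketch below is an illustration only: the function name `nyquist_sum`, the frequency grid size, and the two-tap example filter are assumptions, not taken from the paper. It evaluates the sum of shifted squared magnitude responses on a grid.

```python
import numpy as np

def nyquist_sum(h, N, num_freqs=64):
    """Evaluate sum_{k=0}^{N-1} |H(w - 2*pi*k/N)|**2 on a grid of w in
    [0, 2*pi) for an FIR filter with impulse response h."""
    h = np.asarray(h, dtype=complex)
    n = np.arange(h.size)
    w = np.linspace(0.0, 2.0 * np.pi, num_freqs, endpoint=False)
    total = np.zeros(num_freqs)
    for k in range(N):
        shifted = w - 2.0 * np.pi * k / N
        # H(w) = sum_n h[n] exp(-j w n), evaluated at the shifted frequencies
        H = np.exp(-1j * np.outer(shifted, n)) @ h
        total += np.abs(H) ** 2
    return total
```

For N = 2 the two-tap filter `h = [1, 1] / sqrt(2)` has $|H(\omega)|^2 = 1 + \cos\omega$, so the sum is 2 at every frequency and the constraint is met; the unnormalized filter `[1, 1]` gives 4 and does not meet it.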


Fig.  2.  Illustration  for  compaction  filter 

We  now  state  a  result  which  essentially  describes  the 
optimum  compaction  filter  design  process  for  WSS  signals. 

Theorem  2.1  Consider  fig.  1  with  WSS  x(k).  Given  that 
H0  (co)  is  an  optimum  compaction  filter  of  the  signal  x(k). 
Then  the  analysis  filter  H  i(co)  is  an  optimum  compaction 
filter  of  x^fk)  =  x(k)  —  x(fc), . . . ,  and  Ffjv-i(w)  is 
an  optimum  compaction  filter  of  x^N  ~2\k)  =  x^N  ~2\k)  — 


Observe  that  Theorem  2.1  is  a  different  interpretation  to 
the  design  methodology  discussed  in  [11]. 


3. OPTIMUM COMPACTION FILTERS FOR WSCS
SIGNALS 

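Define for i = 0, 1,

$$x_i(k) = x(2k - i), \qquad s_i(k) = s(2k - i),$$

and denote the M blocked versions of $x_i(k)$, $s_i(k)$, i = 0, 1
as

$$\mathbf{x}_i(k) = [x(2(Nk - (0 : 1 : N-1)) - i)]^T,$$
$$\mathbf{s}_i(k) = [s(2(Nk - (0 : 1 : N-1)) - i)]^T.$$

Call

$$\mathbf{x}(k) = [\mathbf{x}_0^T(k), \mathbf{x}_1^T(k)]^T,$$

$$\mathbf{s}(k) = [\mathbf{s}_0^T(k), \mathbf{s}_1^T(k)]^T.$$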

Every LPTV system with period M can be interpreted
as a multiple-input/multiple-output LTI system with
M inputs and M outputs. Let $h_{mn}(k, l)$ be the $M \times M$ impulse
response matrix relating $x_n(k)$ and $s_m(k)$. Obviously
this is an LTI system, and we can define

$$H_{mn}(z) = \sum_{k} h_{mn}(k) z^{-k}.$$

Note that $H_{00}(z)$ and $H_{01}(z)$ respectively represent the LTI
systems relating the blocked even and odd samples of the
input of the M-periodic system H(k, z) to the blocked even
samples of the output of H(k, z).

Define the $2M \times 1$ vector

$$v(k) = [v_0(Nk - (0 : 1 : N-1)), v_1(Nk - (N-1 : -1 : 0))]^T.$$

Even when the analysis and synthesis filters are LPTV,
the polyphase representation shown in fig. 3 still holds,
[12]. The analysis and synthesis sides in fig. 1 are respectively
replaced by the $2M \times 2M$ LTI operators E(z), R(z).
Note that the operator E(z) relates the $2M \times 1$ vectors x(k)
and v(k).

Perfect  reconstructability  reduces  to  the  requirement  that 
R{z)  =  E~1{z),  and  orthonormality  to  the  requirement  that 
for  all  co 

[E(w)]+  E(co)  =  R(co)  =  I. 

Since $S_x(\omega)$, the psd matrix of x(k), is positive definite
Hermitian symmetric, it can be expressed as

$$S_x(\omega) = U(\omega) \Lambda(\omega) [U(\omega)]^\dagger \qquad (3.2)$$

with $U(\omega)$ unitary at all $\omega$, and

$$\Lambda(\omega) = \mathrm{diag}\{\lambda_0(\omega), \ldots, \lambda_{2N-1}(\omega)\} \qquad (3.3)$$

obeying at all $\omega$

$$\lambda_i(\omega) \ge \lambda_{i+1}(\omega) > 0. \qquad (3.4)$$

Further, $S_v(\omega)$, the psd matrix of v(k), obeys

$$S_v(\omega) = E(\omega) S_x(\omega) [E(\omega)]^\dagger. \qquad (3.5)$$

The canonical solution to the subband coding problem
under dynamic bit allocation and static bit allocation, treated
in [1] and [6] respectively, is given by

$$E(\omega) = [U(\omega)]^\dagger. \qquad (3.6)$$

In this case $S_v(\omega) = \Lambda(\omega)$. The $H_i(k,z)$ and $F_i(k,z)$ can
be obtained by unblocking E(z) and $E^{-1}(z)$.
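The canonical solution (3.6) can be sketched at a single frequency sample. The code below is a minimal illustration under assumptions: the function name `canonical_analysis` is hypothetical, and the psd matrix passed in is treated as one frequency sample of $S_x(\omega)$ rather than a full frequency-dependent design.

```python
import numpy as np

def canonical_analysis(S_x):
    """For one frequency sample of a Hermitian psd matrix S_x, return
    E = U^dagger, S_v = E S_x E^dagger, and the eigenvalues sorted so that
    lambda_0 >= lambda_1 >= ... as in (3.4)."""
    lam, U = np.linalg.eigh(S_x)          # eigh returns ascending eigenvalues
    order = np.argsort(lam)[::-1]         # reorder to decreasing
    lam, U = lam[order], U[:, order]
    E = U.conj().T
    S_v = E @ S_x @ E.conj().T
    return E, S_v, lam
```

The returned `S_v` should be diagonal with the sorted eigenvalues on its diagonal, which is the statement $S_v(\omega) = \Lambda(\omega)$ at that frequency, and `E` should be unitary.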

The main point of this Section is to define an optimum
compaction process for WSCS signals that leads to
the design of the subband coder. To do so we first define
the M-periodic optimum compaction of an M-periodic
WSCS process. The definition of optimum compaction of
a WSS signal given earlier involves a filter H(z) for which
$H(z)H^*(z^{-1})$ is Nyquist-2. For an LPTV system H define
the adjoint filter $H^a$ whose impulse response $h^a(k,l)$
is related to the impulse response h(k,l) of H by

$$h^a(k,l) = h^*(l,k). \qquad (3.7)$$

Observe that the adjoint of an LTI system with transfer
function H(z) has transfer function $H^*(z^{-1})$. Thus the analog of
a system with transfer function $H(z)H^*(z^{-1})$ in the Linear
Time Varying (LTV) case is the LTV system $HH^a$.


Fig. 3. Blocked polyphase representation

We now give the appropriate definition of LTV Nyquist-2
filters. When H(k,z) is M-periodic then $HH^a$ is Nyquist-2
iff for all $\omega$.


We  now  define  the  optimum  compaction  process  for  WSCS 
signals. 

Definition 3.1 Consider fig. 2, with x(k) WSCS with period
M, H LPTV with period M, and N = 2. Then H is an
optimum compaction filter for x(k) if subject to $HH^a$ being
Nyquist-2, and for some index set $\{k_0, k_1, \ldots, k_{M-1}\} =
\{0, \ldots, M-1\}$, it simultaneously maximizes the partial
variance sums

$$\sum_{i=0}^{l} \sigma^2_v(k_i) \qquad (3.9)$$

for all $0 \le l \le M-1$.

Observe that this definition is targeted to accommodate the
fact that v(k) is WSCS with period M. Consequently M
variance values must be considered. In the sequel, we will
call an optimum compaction filter canonical if in (3.9),

$$k_i = i.$$

Note that even the canonical optimum compaction filter is
nonunique. This is consistent with the fact that LTI optimum
compaction filters for WSS processes are also nonunique,
[11].

We  now  state  the  main  results  of  this  Section. 

Theorem 3.1 Consider fig. 1 with N = 2, $H_i(k,z)$ M-periodic,
and x(k) WSCS with period M. Then the $H_0(k,z)$
provided by the canonical solution (3.6) is an M-periodic
canonical optimum compaction filter of x(k).

Denote by $S_x(\omega)$ the $M \times M$ psd matrix of the $M \times 1$
WSS vector x(k) obtained by blocking x(k). We can write

$$S_x(\omega) = V(\omega) \Lambda(\omega) [V(\omega)]^\dagger$$

with $V(\omega)$ unitary for all $\omega$. Denote

I>M=  v{^  rr°_2,)  .  (3.10) 

Theorem 3.2 Suppose $H_0(k, z)$ is an optimum compaction
filter of the WSCS signal $x(k)$. Then the filter $H_1(k, z)$ is
an  optimum  compaction  filter  of  the  signal  obtained  by  un¬ 
blocking  the  2M x  1  vector  xj  'f  k)  =  |V(u;)  —  j  x(k), 

where $H_0(\omega)$, a $2M \times 2M$ matrix, is the $2M$-fold blocked
version of $H_0(k, z)$.

Theorems 3.1 and 3.2 together define the optimum compaction
filter design for WSCS signals.

4.  CONCLUSIONS 

In  this  paper,  we  presented  the  optimum  compaction  filter 
design  for  the  subband  coding  of  WSCS  signals.  We  first 
gave  a  different  interpretation  to  the  compaction  process  in 
the  case  of  subband  coding  of  WSS  signals.  The  design  pro¬ 
cedure  was  then  extended  to  WSCS  signals  by  considering 
a 2-channel subband coder. We showed that the analysis filters
of the orthonormal filter bank can be designed sequentially
by an optimal compaction process, with the optimum
compaction filters designed with a priori knowledge of
just the second order statistics of the input WSCS signal.


5.  REFERENCES 


[1]  S.  Dasgupta,  C.  Schwarz,  B.D.O.  Anderson,  “Op¬ 
timum  subband  coding  of  cyclostationary  signals”, 
IEEE  International  Conference  on  Acoustics,  Speech 
and  Signal  Processing  -  Proceedings,  pp  1489-1492, 
Mar  15-Mar  19  1999. 

[2]  W.  A.  Gardner,  “Exploitation  of  spectral  redundancy 
in  cyclostationary  signals”,  IEEE  Signal  Processing, 
pp  14-36,  Apr  1991. 

[3]  R.A.  Horn,  C.R.  Johnson,  Matrix  Analysis,  Cambridge 
University  Press,  1999. 

[4] P. Moulin, K.M. Mihcak, “Theory and design of
signal-adapted FIR paraunitary filter banks”, IEEE
Transactions on Signal Processing, pp 920-929, Apr 1998.

[5]  S.  Ohno,  H.  Sakai,  “Optimization  of  filter  banks  using 
cyclostationary  spectral  analysis”,  IEEE  Transactions 
on  Signal  Processing,  pp  2718  -2725,  Nov  1996. 

[6] A. Pandharipande, S. Dasgupta, “Subband coding
of cyclostationary signals with static bit allocation”,
IEEE Signal Processing Letters, pp 284-286, Nov 1999.

[7]  V.  Sathe,  P.P.  Vaidyanathan,  “Effects  of  multirate  sys¬ 
tems  on  the  statistical  properties  of  random  signals”, 
IEEE  Transactions  on  Signal  Processing ,  pp  131-146, 
Jan  1993. 

[8]  M.K.  Tsatsanis,  G.B.  Giannakis,  “Principal  compo¬ 
nent  filter  banks  for  optimal  multiresolution  analysis”, 
IEEE  Transactions  on  Signal  Processing ,  pp  1766- 
1776,  Aug  1995. 

[9]  M.  Unser,  “On  the  optimality  of  ideal  filters  for  pyra¬ 
mid  and  wavelet  signal  approximation”,  IEEE  Trans¬ 
actions  on  Signal  Processing ,  pp  3591-3596,  Dec 
1993. 

[10]  P.P.  Vaidyanathan,  “Properties  of  optimal  compaction 
filters  in  subband  coding”,  IEEE  Digital  Signal  Pro¬ 
cessing  Workshop,  pp  89-92,  Sep  1-4  1996. 

[11]  P.P.  Vaidyanathan,  “Theory  of  optimal  orthonormal 
subband  coders”,  IEEE  Transactions  on  Signal  Pro¬ 
cessing,  pp  1528-1543,  Jun  1998. 

[12]  P.P.  Vaidyanathan,  S.K.  Mitra,  “Polyphase  networks, 
block  digital  filtering,  LPTV  systems,  and  alias-free 
QMF  banks:  a  unified  approach  based  on  pseudocircu- 
lants”,  IEEE  Transactions  on  Acoustics,  Speech,  and 
Signal  Processing,  pp  381-391,  Mar  1988. 


OPTIMAL  TRANSCEIVERS  FOR  DMT  BASED  MULTIUSER  COMMUNICATION 


Ashish  Pandharipande  and  Soura  Dasgupta 


Electrical  and  Computer  Engineering,  The  University  of  Iowa,  Iowa  City,  IA-52242,  USA. 
Email: pashish@engineering.uiowa.edu and dasgupta@eng.uiowa.edu


ABSTRACT 

This paper considers discrete multitone modulation (DMT)
for multiuser communications where different users are supported
by the same system. These users may have differing
quality of service (QoS) requirements, as quantified by their
respective bit rate and symbol error rate specifications. Our
goal is to minimize the transmitted power given the QoS
specifications for the different users, subject to the knowledge
of colored interference at the receiver input. In particular
we find an optimum bit loading scheme that distributes the
bit rate transmitted across the various subchannels belonging
to the different users, and subject to this bit allocation,
determine an optimum transceiver.

1.  INTRODUCTION 

The discrete multitone modulation (DMT) channel coding
scheme has established itself as an effective high rate
data communication technique in both wired and wireless
environments and is used for example in ADSL and HDSL,
[1]. We consider DMT in a multiuser environment. Thus
the DMT system studied here supports multiple users, with
varying quality of service (QoS) requirements, quantified by
their respective bit rate and symbol error rate (SER) specifications.

Specifically consider the DMT system as in fig. 1 which
depicts an $M$-subchannel filterbank model of a DMT system.
We consider an overinterpolated ($N > M$) filterbank
as the transceiver. We assume that the channel $C(z)$ is FIR
of length $\kappa$ (preequalization is assumed to have been done),
and $v(n)$ is additive colored noise with known spectrum.
Thus for example $v(n)$ could represent co-channel interference.
Note, [8] provides models for cochannel interference
in a variety of settings. To mitigate intersymbol interference
(ISI), a form of redundancy is incorporated by choosing
$N = M + \kappa$. The transmitting filters, $F_k(z)$, and the
receiving filters, $H_k(z)$, are constrained to length $N$, and act
as modulating and demodulating transforms respectively. In
a DFT based DMT implementation [1], the IDFT and DFT
are used as the modulating and demodulating transforms
respectively.

In this paper, as in [5] we will consider more general
transformations leading to a generalized DMT system. To
capture a multiuser environment, we assume that there are $L$
users each having been assigned $M/L$ subchannels. Further
the $k$-th user requires a bit rate of $t_k$, and an SER of no more
than $\eta_k$. Our goal is to select $F_k$ and $H_k$ and distribute the bit
rates among the various sub-channels to achieve the above
specifications with the minimum possible transmitted power.
The problem addressed here thus directly generalizes that
in [5], which also addresses the same power minimization
issue, but assuming a single user subject to only one bit
rate and SER constraint. The multiuser setting renders the
optimization problem highly nontrivial in comparison to the
single user case. Further we show as much as 8 dB and 12
dB savings in transmit power in our simulations with general
DMT systems over DFT based DMT systems with optimal
bit allocation and no bit allocation respectively.

Related  literature  includes  [4]  which  develops  fast  load¬ 
ing  algorithms  using  table  lookups  and  a  fast  Lagrange  bi¬ 
section  method  for  a  single  user  setting.  [7]  considers  a 
single  user  optimization  of  the  transceiver  mutual  informa¬ 
tion.  [3],  considers  the  optimum  bit  loading  problem  when 
two  users  are  present. 

Section 2 defines the generalized DMT system and formulates
a precise mathematical problem. Sections 3 and 4
respectively consider the bit rate allocation and filter selection
problems. Section 5 gives simulations.

2.  DMT  BASED  MULTIUSER  SYSTEM  MODEL 

In this Section we give some preliminaries. Specifically
in Section 2.1, we recount the details of the generalized
DMT system provided in [5]. Section 2.2 provides a precise
optimization problem.

2.1.  Polyphase  representation  of  the  DMT  system 

Consider the filterbank based DMT model in fig. 1. $v(n)$
is a zero mean wide sense stationary additive noise.

Fig. 1. Filter bank based DMT model.

As the filters $F_k(z)$ and $H_k(z)$ have lengths $\le N$, we may
write the following polyphase decompositions:
$F_k(z) = \sum_{i=0}^{N-1} z^{-i} G_{i,k}$, and $H_k(z) = \sum_{i=0}^{N-1} z^{i} S_{k,i}$,
with constant $G_{i,j}$ and $S_{i,j}$. Define the $N \times M$ matrix $G$ with
$ij$-th element $G_{i,j}$ and the $M \times N$ matrix $S$ with elements $S_{i,j}$.
Call the constant matrices $G$ and $S$ the transmitting and receiving
matrix respectively. Then with $\mathbf{x}$ and $\hat{\mathbf{x}}$ the vectors of the
signals $x_i$ and $\hat{x}_i$ respectively, and $\mathbf{v}$ the blocked version of $v(n)$,
one has the equivalent system in fig. 2. Here the pseudocirculant
matrix $\mathbf{C}(z)$ [9], is formed by the coefficients of the
FIR channel $C(z) = c_0 + c_1 z^{-1} + \ldots + c_\kappa z^{-\kappa}$. It obeys:

$$\mathbf{C}(z) = [\, C_0 \;\; C_1(z) \,] \qquad (2.1)$$

where $C_0$ is constant, $N \times M$, and $C_1(z)$ is $N \times \kappa$. Note
the knowledge of the autocorrelation of $v(n)$ yields the
autocorrelation matrix of $\mathbf{v}$.

Fig. 2. Polyphase representation of the DMT system.

For DMT systems using zero padding, the transmitting
and receiving matrices are respectively given by

$$G = \begin{bmatrix} W^{\dagger} \\ 0 \end{bmatrix}, \qquad S = \Gamma^{-1} [\, W \;\; W_0 \,] \qquad (2.2)$$

where $W$ is the $M \times M$ unitary DFT matrix with
$[W]_{l,m} = \frac{1}{\sqrt{M}} e^{-j 2\pi l m / M}$, $l, m = 0, \ldots, M-1$, $W_0$ is the $M \times \kappa$
submatrix of $W$ having the first $\kappa$ columns of $W$, and $\Gamma$ is the
$M \times M$ diagonal matrix with elements that are the $M$-point
DFTs of the channel impulse response, [1]. We consider
more general DMT systems that can lead to reduction in
sidelobes and better noise rejection properties of the filters.
The transmitting matrix of such a general DMT is given by

$$G = \begin{bmatrix} G_0 \\ 0 \end{bmatrix} \qquad (2.3)$$

where $G_0$ is an arbitrary $M \times M$ unitary matrix. The condition
for perfect reconstruction (PR) is given as

$$S\,\mathbf{C}(z)\,G = I. \qquad (2.4)$$

Using (2.1) and (2.3), the PR condition reduces to

$$S C_0 G_0 = I. \qquad (2.5)$$

Using singular value decomposition, $C_0$ can be written as

$$C_0 = \underbrace{[\, U_0 \;\; U_1 \,]}_{U} \begin{bmatrix} \Lambda \\ 0 \end{bmatrix} V^{\dagger} = U_0 \Lambda V^{\dagger} \qquad (2.6)$$

where $U$ and $V$ are respectively $N \times N$ and $M \times M$ unitary
matrices whose columns are the eigenvectors of $C_0 C_0^{\dagger}$ and
$C_0^{\dagger} C_0$. $\Lambda$ is the $M \times M$ diagonal matrix with diagonal
elements that are the singular values of $C_0$.

Using (2.6), one clear choice for $S$ satisfying (2.5) is

$$S = G_0^{\dagger} V \Lambda^{-1} U_0^{\dagger}. \qquad (2.7)$$
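As a numerical illustration of (2.5) to (2.7), the following minimal sketch builds $C_0$ for a short channel, draws an arbitrary unitary $G_0$, forms $S$ from the SVD, and pushes one zero-padded block through the channel. The block length, the random seed, and the variable names are assumptions made for this example only; the channel taps are the ones used later in the simulation section, and samples inside a block are ordered in increasing time for the convolution.

```python
import numpy as np

rng = np.random.default_rng(0)

M = 4                                   # number of subchannels (assumed for the demo)
c = np.array([1.0, 0.5])                # channel taps of C(z) = 1 + 0.5 z^{-1}
kappa = len(c) - 1                      # channel order
N = M + kappa                           # block length after zero padding

# C_0 (N x M): column j holds the channel impulse response delayed by j samples.
C0 = np.zeros((N, M))
for j in range(M):
    C0[j:j + kappa + 1, j] = c

# Arbitrary unitary G_0 (here real orthogonal), obtained from a QR factorization.
G0, _ = np.linalg.qr(rng.standard_normal((M, M)))

# Thin SVD C_0 = U_0 Lambda V^H, then S = G_0^H V Lambda^{-1} U_0^H as in (2.7).
U0, sing, Vh = np.linalg.svd(C0, full_matrices=False)
S = G0.conj().T @ Vh.conj().T @ np.diag(1.0 / sing) @ U0.conj().T

# One noise-free block: the kappa padded zeros absorb the channel tail, so the
# received N-block equals the full convolution of the taps with G_0 x.
x = rng.standard_normal(M)
received = np.convolve(c, G0 @ x)
x_hat = S @ received
```

In this sketch `x_hat` matches `x` up to rounding, which is the noise-free PR property. Other receivers satisfying (2.5) exist; the one above is only the choice named in (2.7).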


2.2. Problem definition


The optimum bit loading problem is to find the best bit rate
allocation scheme to minimize the transmit power, under
different bit rate and SER budgets of the users. The optimal
transceiver is then designed to minimize the power subject
to optimum bit loading.

Assume there are $r$ users, with each user being allocated
$L$ subbands (in fig. 1, $M = rL$ and $N = M + \kappa$). Let
the input power in the $j$-th subband of the $k$-th user be
$\sigma^2_{x_{j,k}}$. Due to PR, this is also the output signal power $\sigma^2_{\hat{x}_{j,k}}$
in the $j$-th subband of the $k$-th user. Let the output noise
power in this subband be $\sigma^2_{e_{j,k}}$, and $b_{j,k}$ be the number
of bits allocated in this subchannel. Due to different QoS
requirements, we may have different bit rate constraints for
the users. The average number of bits for the $k$-th user is
$b_k = \frac{1}{L}\sum_{j=0}^{L-1} b_{j,k}$. However we need to account for the
reduction in bit rate due to the zero padding. The average bit
budget for the $k$-th user is then $t_k = \frac{M}{N} b_k = \frac{M}{NL}\sum_{j=0}^{L-1} b_{j,k}$.

With a high bit rate assumption made on the modulation
system, we have, [5], for the $k$-th user

$$\sigma^2_{x_{j,k}} = c_k\, 2^{2 b_{j,k}}\, \sigma^2_{e_{j,k}} \qquad (2.8)$$

where the constant $c_k$ depends on the SER $\eta_k$. We seek to
minimize the average transmission power given by

$$P = \frac{1}{M}\sum_{k=1}^{r}\sum_{j=0}^{L-1} \sigma^2_{x_{j,k}} = \frac{1}{M}\sum_{k=1}^{r}\sum_{j=0}^{L-1} c_k\, 2^{2 b_{j,k}}\, \sigma^2_{e_{j,k}} \qquad (2.9)$$


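subject to the bit rate budgets

$$t_k = \frac{M}{NL}\sum_{j=0}^{L-1} b_{j,k}, \qquad k = 1, \ldots, r \qquad (2.10)$$

and the PR requirement (2.7).

3. OPTIMUM BIT ALLOCATION

The problem of minimizing (2.9) under the set of constraints
(2.10) is a constrained optimization problem. Using the
AM-GM$^1$ inequality and (2.10),

$$\frac{1}{M}\sum_{k=1}^{r}\sum_{j=0}^{L-1} c_k\, 2^{2 b_{j,k}}\, \sigma^2_{e_{j,k}} \qquad (3.11)$$

$$\ge \frac{L}{M}\sum_{k=1}^{r} c_k \Big(\prod_{j=0}^{L-1} 2^{2 b_{j,k}}\, \sigma^2_{e_{j,k}}\Big)^{1/L} \qquad (3.12)$$

$$= \frac{L}{M}\sum_{k=1}^{r} c_k\, 2^{2 N t_k / M} \Big(\prod_{j=0}^{L-1} \sigma^2_{e_{j,k}}\Big)^{1/L} \qquad (3.13)$$

with equality holding iff for all $j, k$:

$$b_{j,k} = \frac{N}{M} t_k + \frac{1}{2}\log_2\Big(\prod_{i=0}^{L-1} \sigma^2_{e_{i,k}}\Big)^{1/L} - \frac{1}{2}\log_2\big(\sigma^2_{e_{j,k}}\big). \qquad (3.14)$$

This is the optimum bit allocation strategy. The optimal
transceiver design is to find matrices $S$, $G$ so as to minimize

$$J = \sum_{k=1}^{r} \alpha_k \prod_{j=0}^{L-1} a_{j,k}^{1/L} \qquad (3.15)$$

where

$$\alpha_k = c_k\, 2^{2 N t_k / M}, \qquad a_{j,k} = \sigma^2_{e_{j,k}}. \qquad (3.16)$$

Observe, if one chooses $L = 2$, and $\alpha_k = \alpha$ for all $k$, then
(3.15) reduces to the optimization function considered in
[2], for the subband coding of cyclostationary signals.

Optimal arrangement: Observe, [2, 6], that given a set of
positive numbers $\delta_k \ge \delta_{k+1}$, the minimum among all possible
sums of pairwise products $\sum \delta_{k_i}\delta_{k_j}$ is attained by pairing the
$k$-th largest number with the $k$-th smallest. Thus among the
various permutations of $a_{j,k}$, any that minimizes (3.15) must
have the following property:

$$a_{m,k_1} \ge a_{n,k_2} \;\Rightarrow\; \alpha_{k_1}\Big(\prod_{j \ne m}^{L-1} a_{j,k_1}\Big)^{1/L} \le \alpha_{k_2}\Big(\prod_{j \ne n}^{L-1} a_{j,k_2}\Big)^{1/L} \qquad (3.17)$$

and

$$\alpha_m \ge \alpha_n \;\Rightarrow\; \prod_{j=0}^{L-1} a_{j,m} \le \prod_{j=0}^{L-1} a_{j,n}. \qquad (3.18)$$

$^1$ The arithmetic mean (AM) of a set of positive numbers is greater than
or equal to their geometric mean (GM), with equality iff all the numbers
are equal.

4. OPTIMUM TRANSCEIVER DESIGN

In this Section we address the problem of filter selection to
minimize (3.15). This reduces to selecting a unitary matrix
$G_0$. Given that the matrices in (2.6) are known, (2.7) fixes
$S$. Observe that the situation in fig. 3 prevails, and $R_{\hat{e}}$, the
autocorrelation of $\hat{e}$, is known. Further the autocorrelation
matrix of $e$ is given by

$$R_e = G_0^{\dagger} R_{\hat{e}} G_0, \qquad (4.19)$$

and the $a_{j,k}$ in (3.15) are simply the diagonal elements of $R_e$.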
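To make the quantities entering (3.15) concrete, the following minimal sketch evaluates the bit allocation (3.14), the transmit power (2.9), and the cost $J$ of (3.15) to (3.16) for a set of subchannel noise variances, i.e. for the diagonal of the output noise autocorrelation arranged user by user. All numerical values and function names are hypothetical, and the allocation is not rounded to integers or forced to be non-negative, in keeping with the high bit rate assumption.

```python
import numpy as np

def optimum_bits(noise_var, t, M, N):
    """Bit allocation (3.14); noise_var[k, j] is the noise variance in
    subband j of user k, t[k] is the bit budget t_k."""
    log_var = np.log2(noise_var)
    mean_log = log_var.mean(axis=1, keepdims=True)   # log2 of the geometric mean
    return (N / M) * t[:, None] + 0.5 * (mean_log - log_var)

def transmit_power(noise_var, bits, c, M):
    """Average power (2.9): (1/M) sum_k sum_j c_k 2^(2 b_jk) sigma^2_e_jk."""
    return float(np.sum(c[:, None] * 2.0 ** (2.0 * bits) * noise_var) / M)

def cost_J(noise_var, t, c, M, N):
    """Cost (3.15) with alpha_k = c_k 2^(2 N t_k / M) from (3.16)."""
    L = noise_var.shape[1]
    alpha = c * 2.0 ** (2.0 * N * t / M)
    return float(np.sum(alpha * np.prod(noise_var, axis=1) ** (1.0 / L)))

# Hypothetical setting: r = 2 users, L = 2 subbands each, kappa = 1.
M, N = 4, 5
noise_var = np.array([[0.10, 0.40],
                      [0.20, 0.05]])
t = np.array([2.0, 3.0])          # bit budgets t_k
c = np.array([1.0, 2.0])          # SER-dependent constants c_k
L = noise_var.shape[1]

bits = optimum_bits(noise_var, t, M, N)
uniform = np.repeat((N / M) * t[:, None], L, axis=1)   # same bits in every subband

p_opt = transmit_power(noise_var, bits, c, M)
p_uniform = transmit_power(noise_var, uniform, c, M)
bound = (L / M) * cost_J(noise_var, t, c, M, N)        # right-hand side of (3.13)
```

With the allocation of (3.14) the power `p_opt` meets the bound (3.13), while the uniform allocation with the same budgets can only give an equal or larger power.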

We need a few results from the theory of majorization
that will be used in solving the optimization problem at hand.
We will first introduce the notion of majorization and Schur
concavity [6].

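Definition 4.1 Consider two sequences $x = \{x_i\}_{i=1}^{n}$ and
$y = \{y_i\}_{i=1}^{n}$ with $x_i \ge x_{i+1}$ and $y_i \ge y_{i+1}$. Then we say
that $y$ majorizes $x$, denoted as $x \prec y$, if the following holds
with equality at $k = n$:

$$\sum_{i=1}^{k} x_i \le \sum_{i=1}^{k} y_i, \qquad 1 \le k \le n.$$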

Definition 4.2 A real valued function $\phi(z) = \phi(z_1, \ldots, z_n)$
defined on a set $A \subset R^n$ is said to be Schur concave on $A$ if

$x \prec y$ on $A \;\Rightarrow\; \phi(x) \ge \phi(y)$.

$\phi$ is strictly Schur concave on $A$ if strict inequality $\phi(x) >
\phi(y)$ holds when $x$ is not a permutation of $y$.


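We will now state a theorem that results in a test for strict
Schur concavity. We denote

$$\phi_{(k)}(z) = \frac{\partial \phi(z)}{\partial z_k}$$

and

$$\phi_{(i,j)}(z) = \frac{\partial^2 \phi(z)}{\partial z_i\, \partial z_j}.$$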

Theorem 4.1 Let $\phi(z)$ be a scalar real valued function defined
and continuous on $\mathcal{D}$, and twice differentiable on the
interior of $\mathcal{D}$. Then $\phi(z)$ is strictly Schur concave on $\mathcal{D}$ iff

(i) $\phi$ is symmetric in its arguments,

(ii) $\phi_{(k)}(z)$ is increasing in $k$, and

(iii) $\phi_{(k)}(z) = \phi_{(k+1)}(z) \;\Rightarrow\; \phi_{(k,k)}(z) - \phi_{(k,k+1)}(z) -
\phi_{(k+1,k)}(z) + \phi_{(k+1,k+1)}(z) < 0$.


Theorem 4.2 If $H$ is an $n \times n$ Hermitian matrix with diagonal
elements $h_1, \ldots, h_n$ and eigenvalues $\lambda_1, \ldots, \lambda_n$, then


$h \prec \lambda$ on $R^n$.
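Theorem 4.2 can be checked numerically. The sketch below is only an illustration: the helper name `majorizes`, the tolerance, and the random test matrix are assumptions of this example. It sorts both sequences in decreasing order as in Definition 4.1, compares partial sums, and applies the test to the diagonal and the eigenvalues of a random Hermitian matrix.

```python
import numpy as np

def majorizes(y, x, tol=1e-9):
    """Return True if y majorizes x in the sense of Definition 4.1."""
    xs = np.sort(np.asarray(x, dtype=float))[::-1]   # decreasing order
    ys = np.sort(np.asarray(y, dtype=float))[::-1]
    partial_ok = np.all(np.cumsum(xs) <= np.cumsum(ys) + tol)
    total_ok = abs(xs.sum() - ys.sum()) <= tol       # equality at k = n
    return bool(partial_ok and total_ok)

rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
H = B @ B.conj().T                      # random Hermitian (positive semidefinite) matrix

h = np.real(np.diag(H))                 # diagonal elements
lam = np.linalg.eigvalsh(H)             # real eigenvalues of a Hermitian matrix

diag_majorized_by_eigs = majorizes(lam, h)
eigs_majorized_by_diag = majorizes(h, lam)
```

For a generic, non-diagonal $H$ the first flag is true and the second is false: the eigenvalues are the most spread-out sequence with the given trace.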


To  connect  the  results  from  majorization  theory  devel¬ 
oped  to  our  optimization  problem,  we  state  the  following 
lemma. 


Fig.  3.  Receiver  block  diagram. 


Theorem 4.3 The real valued scalar function $J$ as defined
in (3.15) under the optimality conditions (3.17)-(3.18) is strictly
Schur concave.

In particular as the search of $G_0$ is restricted to unitary
matrices, if one chooses $G_0$ to be a matrix of orthonormal
eigenvectors of $R_{\hat{e}}$, then $R_e$ is a diagonal matrix containing
the eigenvalues of $R_{\hat{e}}$. Note that the diagonal elements of $R_e$
are $\sigma^2_{e_{j,k}}$. Thus from theorem 4.2, this choice of $G_0$ yields
a sequence of $\sigma^2_{e_{j,k}}$ that majorizes all other achievable
sequences. Consequently, if arranged optimally, Theorem 4.3
implies that such a sequence will minimize (3.15). It remains
simply to arrange the eigenvalues of $R_{\hat{e}}$ among the $\sigma^2_{e_{j,k}}$,
through exhaustive search if need be, so that an arrangement
that minimizes (3.15) is obtained. Thus for a suitable
permutation matrix, $P$, the optimizing $G_0$ is

G0  =  Pi).  (4.20) 
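The eigenvector choice and the exhaustive arrangement search can be sketched as follows. This is a minimal illustration under stated assumptions: the permutation is applied to the columns of the eigenvector matrix, diagonal entry number $kL + j$ is assigned to subband $j$ of user $k$, and the matrix, the weights, and the function names are hypothetical. The search visits all $(rL)!$ orderings, so it is only practical for small $M$.

```python
import itertools
import numpy as np

def cost(a, alpha):
    """J of (3.15); a[k, j] is the noise variance of subband j of user k."""
    L = a.shape[1]
    return float(np.sum(alpha * np.prod(a, axis=1) ** (1.0 / L)))

def best_transform(R_hat, alpha, L):
    """Try every column ordering of an eigenvector matrix of R_hat and
    keep the ordering with the smallest cost J."""
    eigvals, Q = np.linalg.eigh(R_hat)          # orthonormal eigenvectors in columns
    r = len(alpha)
    best_J, best_G0 = np.inf, None
    for perm in itertools.permutations(range(r * L)):
        order = list(perm)
        J = cost(eigvals[order].reshape(r, L), alpha)
        if J < best_J:
            best_J, best_G0 = J, Q[:, order]
    return best_J, best_G0

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4))
R_hat = B @ B.T + 0.1 * np.eye(4)               # hypothetical positive definite autocorrelation
alpha = np.array([1.0, 3.0])                    # hypothetical weights alpha_k for r = 2 users

J_opt, G0 = best_transform(R_hat, alpha, L=2)
R_e = G0.T @ R_hat @ G0                         # diagonal, as in (4.19)
J_identity = cost(np.diag(R_hat).reshape(2, 2), alpha)   # cost with G_0 = I
```

Because $J$ is a concave function of the variances and every achievable diagonal is majorized by the eigenvalues, the searched value `J_opt` is never larger than the cost obtained with another unitary choice such as `J_identity`.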

5.  SIMULATION  RESULTS 

In this section, we compare the transmitting power of the
DFT based DMT under no bit allocation and optimum bit
allocation with our optimum transceiver. We assume the
channel to be $C(z) = 1 + 0.5z^{-1}$, and a noise source $v(n)$
whose power spectral density is shown in fig. 4. The plot
shows that there is an 8 dB saving in transmit power with
our design over the DFT based DMT under optimum bit
allocation, and a 12 dB improvement over the conventional
DMT with no optimum bit allocation. We however note that
there may exist noise environments where the DFT based
DMT performs as well as our optimal design.

6.  CONCLUSIONS 

In this paper, we have presented an optimum bit allocation
strategy and transceiver design for minimizing the transmit
power when different users have varied QoS requirements.
Simulations confirm the efficacy of our results.
1 Introduction


The discrete multitone (DMT) modulation channel coding scheme, also known as
the Orthogonal Frequency Division Multiplexed (OFDM) system, has established itself
as an effective high rate data communication technique in both wired and wireless
environments. It is used for example in high speed ADSL and HDSL modems, [3],
[16] and has been proposed as the modulation scheme of choice in the Mill Bahama
and Magic Wand wireless ATM systems, [4] as well as in the IEEE 802.11a wireless
standard. We consider DMT in a multiuser environment, specifically when a single
DMT system simultaneously supports multiple quality of service (QoS) provisioned
flows. The various flows may represent different multimedia services such as data,
speech, and video, each endowed with different QoS requirements quantified in this
paper by bit rates and symbol error rates (SER). Our goal is to characterize optimal
DMT systems, employing zero padding redundancy, that achieve these multiple QoS
specifications with the minimum transmission power.

We are motivated by the knowledge that future broadband networks will be
expected to provide a wide range of multimedia services. Thus, even wireless networks
must support video conferencing, voice, and data, with different end-to-end QoS
requirements at data rates that can be several orders of magnitude (e.g. 4G in IMT2000)
higher than today’s second-generation (2G) systems [1] [2]. Thus the same OFDM channel
in such future systems will be called upon to deliver data flows with multiple QoS
specifications. Since power conservation is important, data transfer must occur at
the smallest level of permissible power.

Figure 1 depicts the broad contours of DMT communication systems. The basic
idea in this multicarrier technique is to partition the dispersive transmission channel
into a large number of parallel independent subchannels by applying an orthogonal
block transform. Specifically the incoming data stream is converted into $M$ parallel
data streams each operating at a rate that is $M$ times smaller than the original
symbol rate, and each having a distinct carrier. An $M$-point block orthogonal
transformation of these streams of data is followed by a parallel to serial conversion, prior
to transmission through the communication channel. Typically for an FIR channel
of length $\kappa$, extra redundancy of length $\kappa$ is added at the channel input to infuse
resistance to channel induced ISI. Consequently, the effective rate reduction in each
subchannel is by a factor $N$, where $N = M + \kappa$. The equalizer in fig. 1 is used to
keep the effective channel length, $\kappa$, small. The fact that each data stream operates
at a slower rate reduces the dispersive channel effects it experiences. At the channel
output one performs in succession the operations of redundancy removal, parallel to
serial conversion, and the application of an inverse block transform. In traditional
OFDM, the input transform is an Inverse Discrete Fourier Transform (IDFT)
operation, and the output transformation is a block DFT operation. The redundancy at
the channel input in the standard OFDM is a cyclic prefix. Recently several authors
have proposed more general orthogonal block transforms, and the injection of zero
padding redundancy, [8], [11], [7], [15] leading to the so called Generalized DMT
systems. It is such a zero-padding generalized DMT system that is the subject of this
paper.
Figure 1: The DMT system.


The overall system can be captured by the discrete time baseband model described
in fig. 2. The transmitting filters, $F_k(z)$, and the receiving filters, $H_k(z)$, model the
transformation, and redundancy injection and removal operations, and have length
$N$ each. In conventional OFDM, the coefficients of $F_k(z)$ and $H_k(z)$ are, respectively,
related to $M$-point Inverse DFT (IDFT) and DFT coefficients. In this figure the
channel equalizer combination is assumed to have an equivalent discrete time FIR
transfer function $C(z)$ of order $\kappa$. The signal $v(k)$ models the effect of the noise and
interference experienced at the equalizer output.

Figure 2: Filter bank based DMT model.


Several authors have studied the optimum transceiver design in the single user case
[19], [11], [10], [22], [13]. While [11], [10] and [22] are concerned with optimizing
the transmitted power, [19] focuses on the maximization of the mutual information
between the transmitted and received signals. We explain the underlying concept
of the optimization procedure by taking the example of [11], as it most directly
influences the development of this paper. Specifically, [11] assumes no equalizer,
zero padding redundancy, known channel, and noise $v(n)$ of known power spectral
density (psd) modelling co-channel interference. This is partially motivated by the
fact that in HDSL applications, the dominant interference is generated by near end
cross talk (NEXT), [3]. Assuming a single user framework, subject to a specified
target SER and average bit rate, [11] seeks to optimize against the transceiver structure,
i.e. the transmit and receive filters, and the number of bits/symbol assigned to each
subchannel, to minimize the transmitted power. This is done under the assumption
of perfect reconstruction (PR), i.e. that the transceiver output equals the transceiver
input in the noise free case, an orthogonality condition on the transmitter, and the
use of zero-padding redundancy. These ideas are extended in [12] to more general
settings.
The approach of [11] can be best characterized as water-pouring. Suppose a given
set  of  subchannels  see  deeper  channel  nulls  or  experience  higher  levels  of  cochannel 
interference.  Then  the  bit  loading  scheme  must  assign  fewer  bits/symbols  to  these 
subchannels.  To  avoid  too  many  of  such  “low  performing”  subchannels,  the  subchan¬ 
nel  selection  process  must  try  to  squeeze  out  problematic  frequency  bands  with  more 
adverse  conditions  as  best  as  one  can,  specifically  by  forcing  channel  nulls  or  interfer¬ 
ence  spectral  peaks,  to  occupy  as  few  subchannels  as  possible.  In  essence,  [11],  [12] 
develop  formalisms  that  capture  these  tasks. 

In this paper, we extend the notions developed in [11], to a multiuser environment.
Specifically we assume that there are $n$ users with each assigned $M/n$ subchannels.
Further the $k$-th user requires a bit rate of $b_k$, and an SER of no more than $\eta_k$.
Our goal is to select filters $F_k(z)$ and $H_k(z)$, and distribute the bit rates among the
various sub-channels, to achieve the above specifications with the minimum possible
transmitted power. As in [11] we assume that a zero-padding redundancy is employed
and that the transmitter satisfies an orthogonality condition.

Past  treatment  of  optimum  resource  allocation  in  a  multiuser  setting,  is  restricted 
mostly  to  bit  loading  algorithms.  Loading  algorithms  using  efficient  table  lookups 
and  a  fast  Lagrange  bisection  search  method  for  a  single  user  setting  are  developed 
in  [10].  [6]  considers  a  water-filling  approach  to  the  bit  loading  problem  when  two 
users  are  present.  Other  related  papers  are  [24],  [26],  [27],  each  of  which  provide 
algorithms  for  bit  and  power  allocation  under  a  multiuser  setting.  None  considers 
filter  optimization. 

Section 2 ties the DMT model of fig. 1 to the filter bank model of fig. 2. It also
presents  a  polyphase  representation  of  the  DMT  system  that  facilitates  the  optimiza¬ 
tion  task  central  to  this  paper.  Section  3  converts  the  power  minimization  problem 
to  a  precise  optimization  problem.  Sections  4  and  5  consider  bit  loading  and  fil¬ 
ter  selection  respectively.  Section  6  provides  numerical  examples.  Section  7  is  the 
conclusion. 

Notation: In the sequel, the superscripts $(\cdot)^T$, $(\cdot)^H$ will stand for the transpose,
and Hermitian transpose respectively. $I_M$ will denote the $M \times M$ identity matrix.

2 DMT based multiuser system model

In  this  Section  we  consider  the  generalized  DMT  structure  of  [11],  and  tie  it  to  the 
filter  bank  structure  of  fig.  2,  and  a  polyphase  representation. 

2.1  Generalized  DMT  with  zero  padding  redundancy 

Consider a block of $M$ samples of $w(k)$ in fig. 1, i.e.

$$[\, w(Mk),\; w(Mk-1),\; \ldots,\; w(Mk-M+1) \,]^T.$$

Then the redundancy injection process converts this to an $N$-block signal by
appending to each $M$-block an additional $\kappa$ zeros to obtain the $N$-block below, prior to
transmission.

$$[\, w(Mk),\; w(Mk-1),\; \ldots,\; w(Mk-M+1),\; 0,\; \ldots,\; 0 \,]^T.$$

Thus, with

$$I_{zp} = \begin{bmatrix} I_M \\ 0_{\kappa \times M} \end{bmatrix}, \qquad (2.1)$$

$$[\, s(Nk) \;\; s(Nk-1) \;\; \cdots \;\; s(Nk-N+1) \,]^T = I_{zp}\, [\, w(Mk) \;\; w(Mk-1) \;\; \cdots \;\; w(Mk-M+1) \,]^T. \qquad (2.2)$$

The redundancy removal operation is a general linear operation. Specifically with $S_1$
a suitable $M \times N$ matrix,

$$[\, p(Mk) \;\; p(Mk-1) \;\; \cdots \;\; p(Mk-M+1) \,]^T = S_1\, [\, r(Nk) \;\; r(Nk-1) \;\; \cdots \;\; r(Nk-N+1) \,]^T. \qquad (2.3)$$

Define in fig. 1, $G_0$ and $S_0$ as the $M \times M$ transmitter and receiver transform
matrices respectively. In the sequel, as in [11], we assume $G_0$ is unitary, i.e.,

$$G_0^H G_0 = I. \qquad (2.4)$$

Define

$$x(k) = [\, x_0(k),\; x_1(k),\; \ldots,\; x_{M-1}(k) \,]^T, \qquad (2.5)$$

and

$$\hat{x}(k) = [\, \hat{x}_0(k),\; \hat{x}_1(k),\; \ldots,\; \hat{x}_{M-1}(k) \,]^T. \qquad (2.6)$$

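Define also the $N$-fold blocked version of the channel-equalizer combination $C(z) =
c_0 + c_1 z^{-1} + \ldots + c_\kappa z^{-\kappa}$ to be

$$\mathbf{C}(z) = \begin{bmatrix} c_0 & z^{-1}c_{N-1} & \cdots & z^{-1}c_1 \\ c_1 & c_0 & \cdots & z^{-1}c_2 \\ \vdots & \vdots & \ddots & \vdots \\ c_{N-1} & c_{N-2} & \cdots & c_0 \end{bmatrix} \qquad (2.7)$$

with $c_i = 0$, for all $\kappa < i < N$. The above matrix can be written as:

$$\mathbf{C}(z) = [\, C_L \;\; C_R(z) \,] \qquad (2.8)$$

where $C_L$ is an $N \times M$ constant matrix, and $C_R(z)$ is $N \times \kappa$.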
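The structure claimed in (2.7) and (2.8) can be examined with a short sketch. It splits the blocked channel matrix into a constant part and a part multiplied by $z^{-1}$, and checks that the delayed part vanishes in the first $M$ columns, so that $C_L$ is constant. The function name, the example taps, and the ordering of samples inside a block (increasing time, used only for the convolution check) are assumptions of this illustration.

```python
import numpy as np

def blocked_channel(c, N):
    """Return (C_const, C_delay) such that the N x N blocked channel matrix
    of (2.7) equals C_const + z^{-1} * C_delay."""
    taps = np.zeros(N)
    taps[:len(c)] = c                     # c_i = 0 for kappa < i < N
    C_const = np.zeros((N, N))
    C_delay = np.zeros((N, N))
    for i in range(N):
        for j in range(N):
            if i >= j:
                C_const[i, j] = taps[i - j]       # on and below the diagonal
            else:
                C_delay[i, j] = taps[N + i - j]   # entries carrying z^{-1}
    return C_const, C_delay

c = np.array([1.0, 0.5, 0.25])            # hypothetical channel with kappa = 2
M = 4
kappa = len(c) - 1
N = M + kappa

C_const, C_delay = blocked_channel(c, N)
C_L = C_const[:, :M]                      # constant N x M left block of (2.8)
C_R_delay = C_delay[:, M:]                # z^{-1} part of the right block C_R(z)

x = np.arange(1.0, M + 1)                 # one M-block of samples
block_out = C_L @ x                       # channel response to the zero-padded block
```

Here `block_out` coincides with the full convolution of the taps with `x`, which is why zero padding of length $\kappa$ removes interference between successive blocks.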

With $v(k)$ the noise and interference effect at the output of $C(z)$, define

$$\mathbf{v}(k) = [\, v(Nk),\; v(Nk-1),\; \ldots,\; v(Nk-N+1) \,]^T \qquad (2.9)$$

as the $N$-fold blocked version of $v(k)$. Then one has

$$\hat{x}(k) = S_0 S_1 C_L G_0\, x(k) + S_0 S_1 \mathbf{v}(k). \qquad (2.10)$$

2.2  Polyphase  representation  and  perfect  reconstruction 

All three structures discussed can be viewed as being represented by the scheme in
fig. 3, where the $N \times M$ matrix $G$ and $M \times N$ matrix $S$ are given by

$$S = S_0 S_1, \qquad G = I_{zp} G_0. \qquad (2.11)$$

Since the output of $G$ and the input to $S$ in fig. 3 are respectively, the $N$-fold blocked
versions of the channel input and the equalizer output (see fig. 1), one has, [21], the
familiar transmultiplexer structure of fig. 2, where with

$$F_k(z) = \sum_{i=0}^{N-1} z^{-i} G_{i,k}, \qquad H_k(z) = \sum_{i=0}^{N-1} z^{i} S_{k,i}, \qquad (2.12)$$

$$G = [G_{i,k}]_{i=0,\,k=0}^{N-1,\,M-1}, \qquad S = [S_{k,i}]_{k=0,\,i=0}^{M-1,\,N-1}. \qquad (2.13)$$

Figure 3: Polyphase representation of the DMT system.

Observe, $F_k(z)$ are each of degree $M-1$, and $H_k(z)$ have degree $N-1$.
We impose the perfect reconstruction (PR) condition:

$$S\,\mathbf{C}(z)\,G = I, \qquad (2.14)$$

i.e., in the absence of noise/interference

$$\hat{x}(k) = x(k), \quad \text{for all } k.$$

In other words, the DMT system has no Inter Symbol Interference (ISI).

To obtain a more useful characterization of PR, consider the singular value
decomposition of $C_L$ defined in (2.8):

$$C_L = U \begin{bmatrix} \Lambda \\ 0 \end{bmatrix} V^H = U_0 \Lambda V^H \qquad (2.15)$$

where $U$ and $V$ are respectively $N \times N$ and $M \times M$ unitary matrices whose columns
are the eigenvectors of $C_L C_L^H$ and $C_L^H C_L$. $\Lambda$ is the $M \times M$ real, positive definite
diagonal matrix with diagonal elements that are the singular values of $C_L$. Then,
because of (2.10), given $G_0$, the class of all $S$ enforcing PR is completely characterized
by

$$S = S_0 S_1 = G_0^H V \begin{bmatrix} \Lambda^{-1} & A \end{bmatrix} U^H, \qquad (2.16)$$

where $A$ is any arbitrary $M \times \kappa$ matrix. In the sequel it will be useful to partition $U$
as

$$U = [\, U_0 \;\; U_1 \,] \qquad (2.17)$$

where $U_0$ is $N \times M$ and $U_1$ is $N \times \kappa$.
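The freedom expressed by the arbitrary matrix $A$ in (2.16) can be seen numerically. The sketch below is an illustration under assumed values (channel taps, block size, random seed, and the helper name `pr_receiver` are all hypothetical): it forms two receivers from the full SVD of $C_L$, one with $A = 0$ and one with a random $A$, and both satisfy $S C_L G_0 = I$.

```python
import numpy as np

def pr_receiver(C_L, G0, A):
    """S = G0^H V [Lambda^{-1}  A] U^H, following (2.15) and (2.16)."""
    U, sing, Vh = np.linalg.svd(C_L, full_matrices=True)   # U is N x N, Vh is M x M
    middle = np.hstack([np.diag(1.0 / sing), A])           # M x N block [Lambda^{-1}  A]
    return G0.conj().T @ Vh.conj().T @ middle @ U.conj().T

rng = np.random.default_rng(3)
M, kappa = 4, 1
N = M + kappa

# C_L for the hypothetical channel C(z) = 1 + 0.5 z^{-1}.
C_L = np.zeros((N, M))
for j in range(M):
    C_L[j:j + kappa + 1, j] = [1.0, 0.5]

G0, _ = np.linalg.qr(rng.standard_normal((M, M)))          # arbitrary unitary G_0

S_zero = pr_receiver(C_L, G0, np.zeros((M, kappa)))        # A = 0
S_rand = pr_receiver(C_L, G0, rng.standard_normal((M, kappa)))

err_zero = np.linalg.norm(S_zero @ C_L @ G0 - np.eye(M))
err_rand = np.linalg.norm(S_rand @ C_L @ G0 - np.eye(M))
```

The two receivers differ only through the term $G_0^H V A U_1^H$, which acts on the part of the received block outside the range of $C_L$. It therefore leaves PR intact while changing how the noise is filtered.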
Filter selection for power minimization subject to optimum bit rate allocation will be performed by optimizing over the $M \times M$ unitary $G_0$ and the arbitrary matrix $A$. Once $G_0$ is found, (2.11) provides $G$, just as $G_0$ and $A$, together with knowledge of $C_L$, provide $S$ through (2.16). The filters $F_k$, $H_k$ can then be found using (2.12) and (2.13).
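The characterization (2.16) can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' implementation: it assumes the form $S = G_0 V \Lambda^{-1} [I_M \; A] U^H$, takes $C_L$ to be the $N \times M$ Toeplitz matrix of the channel $1 + 0.25 z^{-1}$, and uses hypothetical sizes $M = 4$, $N = 5$ with a random orthogonal $G_0$ and a random $A$. The names `pr_receiver` and `pr_error` are introduced here for illustration only.

```python
import numpy as np

def pr_receiver(C_L, G0, A):
    """Assumed form of (2.16): S = G0 V Lambda^{-1} [I_M  A] U^H."""
    N, M = C_L.shape
    # Full SVD: C_L = U [diag(s); 0] Vh, with U of size N x N.
    U, s, Vh = np.linalg.svd(C_L, full_matrices=True)
    V = Vh.conj().T
    I_A = np.hstack([np.eye(M), A])  # M x N block [I_M  A]
    return G0 @ V @ np.diag(1.0 / s) @ I_A @ U.conj().T

# Hypothetical setup: channel 1 + 0.25 z^{-1}, M = 4 symbols, N = 5 samples.
M, N = 4, 5
taps = [1.0, 0.25]
C_L = np.zeros((N, M))
for col in range(M):
    for delay, value in enumerate(taps):
        C_L[col + delay, col] = value

rng = np.random.default_rng(0)
G0, _ = np.linalg.qr(rng.standard_normal((M, M)))  # arbitrary orthogonal G0
A = rng.standard_normal((M, N - M))                # arbitrary free parameter

S = pr_receiver(C_L, G0, A)
# S C_L collapses to G0, so applying G0^H at the transmitter gives identity.
pr_error = np.linalg.norm(S @ C_L @ G0.conj().T - np.eye(M))
```

Under these assumptions `S @ C_L` reduces to `G0` for every choice of `A`, so `pr_error` stays at rounding level; this is why $A$ remains free for the power minimization.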


3  Power  Minimization:  Formulation 


We now give a precise mathematical formulation of the multiuser power minimization problem defined broadly in the Introduction. To recount, $n$ users are assigned $L$ subchannels each; the precise channel assignments will emerge from the optimization process. The $k$-th user must meet a maximum SER of $\eta_k$, and must maintain a bit rate $b_k$. Thus, assuming that $b_{j,k}$ is the bit rate sustained in the $j$-th subchannel of the $k$-th user, one must have, for all $1 \le k \le n$,

\[ b_k = \frac{1}{N} \sum_{j=0}^{L-1} b_{j,k}. \tag{3.18} \]

The goal is to assign bit rates among the various subchannels and select the filters $F_k(z)$ and $H_k(z)$ so that the target SER is met with the minimum collective transmitter power, subject to PR, constraint (3.18), and the orthonormality conditions particular to the scheme.

We  do  not  require  each  user  to  employ  the  same  modulation  scheme.  Thus  the 
problem  considered  here  generalizes  [11]  in  several  important  respects:  (A)  It  has  to 
contend  with  n  separate  bit  rate  budgets  (3.18)  as  opposed  to  a  single  budget  in  [11], 
(B)  Separate  modulation  schemes  are  permitted  for  different  users.  (C)  Different 
users  may  have  different  SER  requirements. 

To achieve a given SER, most modulation schemes with $b$-bit symbol constellations, $b$ large, require an output SNR of $d\,2^b$, with $d > 0$ determined by the SER and the modulation scheme. Thus, for example, for a $b$-bit square QAM the SER is given by

\[ \eta = 4Q\left(\sqrt{\frac{3\,\mathrm{SNR}}{2^b - 1}}\right) \approx 4Q\left(\sqrt{\frac{3\,\mathrm{SNR}}{2^b}}\right), \quad \text{when } 2^b \gg 1, \tag{3.19} \]

where

\[ Q(a) = \frac{1}{\sqrt{2\pi}} \int_a^{\infty} e^{-x^2/2}\, dx. \]

Thus for large $b$, $\mathrm{SNR} = d\,2^b$ with $d = \frac{1}{3}[Q^{-1}(\eta/4)]^2 > 0$. Consequently, since under the PR condition the output signal power equals the input signal power, in the $j$-th channel of the $k$-th user

\[ \sigma^2_{x_{j,k}} = d_k\, 2^{b_{j,k}}\, \sigma^2_{e_{j,k}}, \tag{3.20} \]

where $\sigma^2_{e_{j,k}}$ is the output noise variance in this subchannel and $d_k$ is a constant determined by the SER and modulation scheme used for the $k$-th user. Because $G_0$ is unitary, in the DMT system with zero padding redundancy, the total transmitted power equals

\[ \sum_{k=1}^{n} \sum_{j=0}^{L-1} \sigma^2_{x_{j,k}} = \sum_{k=1}^{n} \sum_{j=0}^{L-1} d_k\, 2^{b_{j,k}}\, \sigma^2_{e_{j,k}}. \tag{3.21} \]
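As a small worked illustration of (3.19) and (3.20), the following sketch computes the constant $d$ for large square QAM and the resulting subchannel power. It assumes the large-constellation approximation $\eta \approx 4Q(\sqrt{3\,\mathrm{SNR}/2^b})$; the helper names `q_inverse`, `qam_constant` and `required_power` are hypothetical and exist only for this example.

```python
from statistics import NormalDist

def q_inverse(p):
    """Inverse of Q(a) = P(Z > a) for a standard normal Z."""
    return NormalDist().inv_cdf(1.0 - p)

def qam_constant(ser):
    """d such that SNR = d * 2**b meets the target SER (large square QAM)."""
    return q_inverse(ser / 4.0) ** 2 / 3.0

def required_power(ser, bits, noise_var):
    """Signal power d * 2**b * sigma_e^2 needed in one subchannel."""
    return qam_constant(ser) * 2.0 ** bits * noise_var

d = qam_constant(1e-3)
# Plugging d back into the approximation recovers the target SER.
ser_check = 4.0 * (1.0 - NormalDist().cdf((3.0 * d) ** 0.5))
```

Each extra bit in a subchannel doubles the required power at a fixed noise variance, which is what makes the allocation of bits across subchannels matter.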

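Now observe that $\sigma^2_{e_{j,k}}$ are the diagonal elements of the output noise autocorrelation matrix $R_e$. We note that

\[ R_e = G_0 R G_0^H, \tag{3.22} \]

where, because of (2.16),

\[ R = V \Lambda^{-1} \begin{bmatrix} I_M & A \end{bmatrix} \begin{bmatrix} U_0^H \\ U_1^H \end{bmatrix} R_v \begin{bmatrix} U_0 & U_1 \end{bmatrix} \begin{bmatrix} I_M \\ A^H \end{bmatrix} \Lambda^{-1} V^H. \tag{3.23} \]

Here $R_v$ is the known autocorrelation matrix of $v(k)$, and knowledge of the channel equalizer combination provides $\Lambda$, $U_i$ and $V$ from the SVD of $C_L$.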

We then have the following formal statement of the problem to be solved.

Problem 3.1 Given $d_k$, $b_k$, and the $N \times N$ positive definite, Hermitian symmetric $R_v$, find $b_{j,k}$, the $M \times \kappa$ matrix $A$ and the $M \times M$ unitary matrix $G_0$ such that, with $\sigma^2_{e_{j,k}}$ the diagonal elements of $R_e$ as in (3.22) and (3.23), (3.21) is minimized under (2.4) and (3.18).

We will adopt a three step approach to solving this problem.

• Step I: Subject to a given choice of $G_0$ and $A$, find the optimum bit rate allocations $b_{j,k}$.

• Step II: Given $A$, find the optimizing $G_0$.

• Step III: Find the optimizing $A$.

Such a separation becomes possible due to the following considerations. Observe that $\sigma^2_{e_{j,k}}$ are simply the diagonal elements of $R_e$. Consequently they depend only on $G_0$ and $A$ and are independent of the selected $b_{j,k}$. Further, given any choice of $\sigma^2_{e_{j,k}}$, the optimum bit rate allocations $b_{j,k}$ produced by Step I transform (3.21) into an expression that is in fact independent of $b_{j,k}$ and dependent only on $\sigma^2_{e_{j,k}}$, in turn determined by $G_0$ and $A$, and on $b_k$ and $d_k$, supplied by the problem specification. Thus Step I can be conceptually separated from the selection of $G_0$ and $A$.

Similarly, regardless of the choice of $A$, the $G_0$ yielded by Step II reduces (3.21) to an expression that is entirely determined by the eigenvalues of the matrix $R$ in (3.23). These eigenvalues are of course independent of $G_0$ and determined exclusively by $A$. Thus Step III need only find the $A$ that renders these eigenvalues the most favorable. Thus indeed the separation above is justified.


4  Optimum  bit  rate  allocation 


The two components of the optimization problem considered in Section 3 are optimum bit rate allocation, i.e. selection of the $b_{j,k}$, and filter selection. In this Section we ask: Given certain $\sigma^2_{e_{j,k}}$, how does one allocate bit rates $b_{j,k}$ to minimize (3.21)? The problem of minimizing (3.21) under the set of constraints (3.18) is a constrained optimization problem. Using the AM-GM inequality, which states that the arithmetic mean (AM) of a set of positive numbers is greater than or equal to their geometric mean (GM), with equality iff all the numbers are equal, and (3.18),

\[ \sum_{k=1}^{n} \sum_{j=0}^{L-1} d_k\, 2^{b_{j,k}} \sigma^2_{e_{j,k}} \ge \sum_{k=1}^{n} L\, d_k \left( \prod_{j=0}^{L-1} 2^{b_{j,k}} \sigma^2_{e_{j,k}} \right)^{1/L} \tag{4.24} \]

\[ = L \sum_{k=1}^{n} d_k\, 2^{N b_k / L} \left( \prod_{j=0}^{L-1} \sigma^2_{e_{j,k}} \right)^{1/L}, \tag{4.25} \]

with equality holding iff for all $i, j, k$,

\[ 2^{b_{j,k}} \sigma^2_{e_{j,k}} = 2^{b_{i,k}} \sigma^2_{e_{i,k}}. \]

This in turn requires that for all $j, k$:

\[ b_{j,k} = \frac{N}{L} b_k + \frac{1}{L} \log_2 \left( \prod_{i=0}^{L-1} \sigma^2_{e_{i,k}} \right) - \log_2 \left( \sigma^2_{e_{j,k}} \right). \tag{4.26} \]

This is the optimum bit allocation strategy. Note that, as in [11], this bit allocation strategy does not impose the obvious requirement that the $b_{j,k}$ be positive integers. Nonetheless, by suitable rounding off, it represents a good approximation of the attainable minimum.
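The allocation rule can be illustrated for a single user. The sketch below assumes the power model $d_k 2^{b_{j,k}} \sigma^2_{e_{j,k}}$ per subchannel and the budget $\sum_j b_{j,k} = N b_k$; the numerical values and the function names `optimum_bits` and `user_power` are hypothetical.

```python
import math

def optimum_bits(b_k, N, noise_vars):
    """Per-subchannel bit rates that equalize 2**b * sigma^2 for one user."""
    L = len(noise_vars)
    # (1/L) * log2 of the product of the noise variances.
    log_gm = sum(math.log2(v) for v in noise_vars) / L
    return [N * b_k / L + log_gm - math.log2(v) for v in noise_vars]

def user_power(d_k, bits, noise_vars):
    """Total power sum_j d_k * 2**b_j * sigma_j^2 for one user."""
    return sum(d_k * 2.0 ** b * v for b, v in zip(bits, noise_vars))

# Hypothetical numbers: L = 4 subchannels, block length N = 5.
noise_vars = [0.5, 1.0, 2.0, 4.0]
N, b_k, d_k = 5, 2.0, 4.0

bits = optimum_bits(b_k, N, noise_vars)
uniform = [N * b_k / len(noise_vars)] * len(noise_vars)
p_opt = user_power(d_k, bits, noise_vars)
p_uniform = user_power(d_k, uniform, noise_vars)
```

With these numbers the noisier subchannels receive fewer bits, every subchannel ends up with the same power, and the total falls below that of the uniform allocation with the same bit budget.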

The optimal transceiver design is to find matrices $A$ and $G_0$ so as to minimize

\[ J = \sum_{k=1}^{n} \left( a_k \prod_{j=0}^{L-1} a_{j,k} \right)^{1/L}, \tag{4.27} \]

where

\[ a_k = d_k^L\, 2^{N b_k} > 0, \qquad a_{j,k} := \sigma^2_{e_{j,k}} > 0. \tag{4.28} \]

Observe that this cost function depends only on $b_k$, $d_k$ and $\sigma^2_{e_{j,k}}$, and not on the particular selection of $b_{j,k}$, reinforcing the point made at the conclusion of the previous section.

We note that the setting in [11] considers minimization of the cost function

\[ \prod_{j,k} a_{j,k}. \tag{4.29} \]

The altered nature of the cost function underlying the multiuser case of this paper makes the extension to this setting nontrivial.

Given a set of $a_{j,k}$, $a_k$ in (4.27), it behooves us to ask the following question: Which ordering of $a_{j,k}$, $a_k$ leads to the smallest value of $J$? To answer, we provide the following extension of a result by Hardy and Polya, [14].

Lemma  4.1  Given  a  set  of  positive  numbers  { 8  k  \k= , .  with  8k  >  8k+ 1-  Consider 
8kl  bk&  with  6ki:fikt  distinct  for  all  k.  Then  this  quantity  attains  its  minimum 
value  iff  whenever  for  some  i  E  {1,2},  8ki  A  <*+  then  8kj  <  7/( ,  i  ^  j. 

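Proof: Sufficiency is in [14]. Suppose without loss of generality that $\delta_{11} > \delta_{21}$ and $\delta_{12} > \delta_{22}$. Then

\[ \delta_{11}\delta_{12} + \delta_{21}\delta_{22} - (\delta_{11}\delta_{22} + \delta_{21}\delta_{12}) = \delta_{11}(\delta_{12} - \delta_{22}) - \delta_{21}(\delta_{12} - \delta_{22}) = (\delta_{11} - \delta_{21})(\delta_{12} - \delta_{22}) > 0. \tag{4.30} \]

Hence the result.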


Thus the largest $\delta_{k1}$ must be paired with the smallest $\delta_{k2}$, the second largest with the second smallest, etc. Thus, given $a_k$, $k \in \{1, 2, \ldots, n\}$, among the various permutations of $a_{j,k}$, any permutation that minimizes (4.27) must have the following property:

\[ a_{m,k_1} \ge a_{n,k_2} \implies a_{k_1} \prod_{j=0,\, j \ne m}^{L-1} a_{j,k_1} \le a_{k_2} \prod_{j=0,\, j \ne n}^{L-1} a_{j,k_2} \tag{4.31} \]

and

\[ a_m \ge a_n \implies \prod_{j=0}^{L-1} a_{j,m} \le \prod_{j=0}^{L-1} a_{j,n}. \tag{4.32} \]
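The pairing rule behind Lemma 4.1 (largest with smallest) can be confirmed by exhaustive search on a small instance. This is only an illustrative sketch with made-up numbers; `paired_sum` and `best_pairing` are hypothetical helper names.

```python
from itertools import permutations

def paired_sum(xs, ys):
    """Sum of products of the paired entries."""
    return sum(x * y for x, y in zip(xs, ys))

def best_pairing(xs, ys):
    """Exhaustively find the ordering of ys that minimizes sum(x * y)."""
    return min(permutations(ys), key=lambda p: paired_sum(xs, p))

xs = [5.0, 3.0, 2.0, 1.0]   # already sorted in decreasing order
ys = [0.5, 4.0, 1.5, 2.5]

brute = best_pairing(xs, ys)
opposite = sorted(ys)       # smallest y paired with the largest x
```

In the setting of (4.27) the same argument applies with one factor of each product playing the role of `xs` and the remaining factors collected into `ys`, which is how conditions of the form (4.31) and (4.32) arise.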


5  Optimum  transceiver  design 

In this Section we address the problem of filter selection to minimize (4.27). Specifically, we must find an $M \times M$ unitary $G_0$ and an $M \times \kappa$ matrix $A$ to minimize (4.27) under (4.31, 4.32) and (3.22, 3.23), where $a_{j,k}$ are the diagonal elements of $R_e$.

Much of the development in this section exploits elements from the theory of Majorization, [14]. Section 5.1 provides a quick primer. Section 5.2 considers the selection of $G_0$ given $R$ in (3.23), i.e. given $A$. Section 5.3 characterizes the optimum $A$. Section 5.4 consolidates these results.


5.1  Majorization  and  Schur  Concavity 

We first introduce the notion of majorization, [14].

Definition 5.1 Consider two sequences $x = \{x_i\}_{i=1}^{l}$ and $y = \{y_i\}_{i=1}^{l}$ with $x_i \ge x_{i+1}$ and $y_i \ge y_{i+1}$. Then we say that $y$ majorizes $x$, denoted as $x \prec y$, if the following holds with equality at $k = l$:

\[ \sum_{i=1}^{k} x_i \le \sum_{i=1}^{k} y_i, \quad 1 \le k \le l. \]

We say that $y$ weakly super majorizes $x$, i.e. $x \prec^w y$, if

\[ \sum_{i=j}^{l} x_i \ge \sum_{i=j}^{l} y_i, \quad 1 \le j \le l. \]

If $y$ majorizes $x$, then any permutation of $y$ also majorizes any permutation of $x$. The following facts are self evident.

Fact 1 If $x \prec y$ then $x \prec^w y$.

Fact 2 If $x \prec^w y$ and $y \prec^w q$ then $x \prec^w q$.

Fact 3 Suppose $\alpha = \{\alpha_i\}$ with $\alpha_i \ge 0$. Then $(x + \alpha) \prec^w x$.
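Definition 5.1 translates directly into partial-sum checks. The following sketch is one possible way to test the two relations on finite sequences; the function names and the tolerance `tol` are assumptions made for this example.

```python
def majorizes(y, x, tol=1e-12):
    """True if y majorizes x: leading partial sums of sorted x never exceed
    those of sorted y, with equal totals."""
    if len(x) != len(y):
        return False
    xs, ys = sorted(x, reverse=True), sorted(y, reverse=True)
    sx = sy = 0.0
    for xi, yi in zip(xs, ys):
        sx += xi
        sy += yi
        if sx > sy + tol:
            return False
    return abs(sx - sy) <= tol

def weakly_supermajorizes(y, x, tol=1e-12):
    """True if y weakly super majorizes x: the sums of the smallest entries
    of x dominate the corresponding sums for y."""
    if len(x) != len(y):
        return False
    xs, ys = sorted(x), sorted(y)  # ascending, so partial sums are tail sums
    sx = sy = 0.0
    for xi, yi in zip(xs, ys):
        sx += xi
        sy += yi
        if sx < sy - tol:
            return False
    return True
```

For instance, `majorizes([3, 1, 0], [2, 1, 1])` holds, and by Fact 1 so does `weakly_supermajorizes([3, 1, 0], [2, 1, 1])`; adding a nonnegative offset to a sequence, as in Fact 3, preserves the weak relation but breaks the equal-total requirement of majorization.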

We also have the following Fact from [14].

Fact 4 Consider any $M \times M$ Hermitian matrix $R$ with eigenvalues $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_M$, and an $M \times M$ matrix $R_e$ which obeys (3.22) for an $M \times M$ unitary matrix $G_0$. Then the diagonal elements $R_{e,i,i}$ of $R_e$ obey

\[ \{R_{e,i,i}\}_{i=1}^{M} \prec \{\lambda_1, \ldots, \lambda_M\}. \tag{5.33} \]

We also note the following important result from [14].

Lemma 5.1 Consider two $M \times M$ Hermitian matrices $Q_1$ and $Q_2$. Suppose the eigenvalues of $Q_1$, $Q_2$ and $Q_1 + Q_2$ are respectively $\lambda_i(Q_1)$, $\lambda_i(Q_2)$, and $\lambda_i(Q_1 + Q_2)$, with $\lambda_i(\cdot) \ge \lambda_{i+1}(\cdot)$. Then

\[ \{\lambda_i(Q_1 + Q_2)\}_{i=1}^{M} \prec \{\lambda_i(Q_1) + \lambda_i(Q_2)\}_{i=1}^{M}. \]

We now consider the notion of Schur concavity, [14].

Definition 5.2 A real valued function $\phi(z) = \phi(z_1, \ldots, z_n)$ defined on a set $\mathcal{A} \subset \mathbb{R}^n$ is said to be Schur concave on $\mathcal{A}$ if

\[ x \prec y \text{ on } \mathcal{A} \implies \phi(x) \ge \phi(y). \]

$\phi$ is strictly Schur concave on $\mathcal{A}$ if the strict inequality $\phi(x) > \phi(y)$ holds when $x$ is not a permutation of $y$.

We will now state a theorem that results in a test for strict Schur concavity. We denote

\[ \phi_{(k)}(z) = \frac{\partial \phi(z)}{\partial z_k} \quad \text{and} \quad \phi_{(i,j)}(z) = \frac{\partial^2 \phi(z)}{\partial z_i\, \partial z_j}. \]

Theorem 5.1 Let $\phi(z)$ be a scalar real valued function defined and continuous on $\mathcal{D} = \{(z_1, \ldots, z_n) : z_1 \ge \ldots \ge z_n\}$, and twice differentiable on the interior of $\mathcal{D}$. Then $\phi(z)$ is strictly Schur concave on $\mathcal{D}$ if

(i) $\phi_{(k)}(z)$ is increasing in $k$,
and

(ii) $\phi_{(k)}(z) = \phi_{(k+1)}(z) \implies \phi_{(k,k)}(z) - \phi_{(k,k+1)}(z) - \phi_{(k+1,k)}(z) + \phi_{(k+1,k+1)}(z) < 0$.

Finally we have the following other important fact from [14].

Fact 5 Suppose $\phi(z)$ satisfies the conditions of Theorem 5.1. Then $\phi(x) \ge \phi(y)$ whenever $x \prec^w y$.

5.2  Selecting  Go 

Observe that by fixing $A$ one automatically fixes $R$ in (3.23). In this section we solve the following modified problem addressing Step II of the three step procedure referred to in the foregoing.

Modified Problem: Given an $M \times M$ positive definite, Hermitian symmetric $R$, find an $M \times M$ unitary $G_0$ to minimize (4.27) under (4.31, 4.32) and (3.22), where $a_{j,k}$ are the diagonal elements of $R_e$.

To this end we have the following pivotal theorem.

Theorem 5.2 The real valued scalar function $J$ with $a_{j,k}$ as its arguments, as defined in (4.27), and $a_k$, $a_{j,k}$ positive, is strictly Schur concave under the optimal arrangement conditions (4.31)-(4.32).

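Proof: We note that

\[ \frac{\partial J}{\partial a_{j,k}} = \frac{1}{L}\, \frac{\left( a_k \prod_{i=0}^{L-1} a_{i,k} \right)^{1/L}}{a_{j,k}}. \]

Thus if $a_{j,k} > a_{l,k}$,

\[ \frac{\partial J}{\partial a_{j,k}} < \frac{\partial J}{\partial a_{l,k}}. \]

Now suppose for some $p \ne m$, $0 \le i, l \le L - 1$, $a_{i,p} > a_{l,m}$. Then from (4.31),

\[ a_p \prod_{j=0,\, j \ne i}^{L-1} a_{j,p} \le a_m \prod_{j=0,\, j \ne l}^{L-1} a_{j,m}. \]

Consequently

\[ \frac{\partial J}{\partial a_{i,p}} = \frac{1}{L}\, \frac{\left( a_p \prod_{j \ne i} a_{j,p} \right)^{1/L}}{a_{i,p}^{1 - 1/L}} < \frac{1}{L}\, \frac{\left( a_m \prod_{j \ne l} a_{j,m} \right)^{1/L}}{a_{l,m}^{1 - 1/L}} = \frac{\partial J}{\partial a_{l,m}}. \]

Thus condition (i) of Theorem 5.1 is met.
Now observe

\[ \frac{\partial^2 J}{\partial a_{i,p}^2} = -\frac{L-1}{L^2}\, \frac{\left( a_p \prod_{j \ne i} a_{j,p} \right)^{1/L}}{a_{i,p}^{2 - 1/L}} < 0. \]

Also with $j \ne i$,

\[ \frac{\partial^2 J}{\partial a_{i,p}\, \partial a_{j,p}} = \frac{1}{L^2}\, \frac{\left( a_p \prod_{l=0}^{L-1} a_{l,p} \right)^{1/L}}{a_{i,p}\, a_{j,p}} > 0. \]

Finally with $p \ne m$,

\[ \frac{\partial^2 J}{\partial a_{j,p}\, \partial a_{i,m}} = 0. \]

Thus (ii) of Theorem 5.1 always holds. Hence the result. ■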

We now use this result to solve this modified problem.

Theorem 5.3 Suppose the positive definite Hermitian $M \times M$ matrix $R$ has eigenvalues $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_M$, with corresponding eigenvectors $\psi_1, \psi_2, \ldots, \psi_M$. Then the $G_0$ that solves the modified problem above is as below, with $P$ a suitable permutation matrix:

\[ G_0 = P \begin{bmatrix} \psi_1^H \\ \psi_2^H \\ \vdots \\ \psi_M^H \end{bmatrix}. \tag{5.34} \]

Further, the $a_{j,k}$ that result are simply these $\lambda_i$.

Proof: From Fact 4 and Theorem 5.2, the optimizing $R_e$ must have diagonal elements $\lambda_1, \lambda_2, \ldots, \lambda_M$. Thus the rows of $G_0$ must be the Hermitian transposes of the corresponding eigenvectors. The permutation matrix $P$ simply arranges these eigenvalues in a way to ensure that an optimum arrangement that minimizes (4.27), with $a_{j,k} = \lambda_i$, is attained. ■
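The two ingredients of this proof, Fact 4 and the eigenvector construction of $G_0$, can be observed numerically. The sketch below uses a randomly generated positive definite $R$ (a hypothetical stand-in, not data from the paper), omits the permutation $P$, and assumes the relation $R_e = G_0 R G_0^H$.

```python
import numpy as np

rng = np.random.default_rng(1)
M = 4
B = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
R = B @ B.conj().T + M * np.eye(M)        # Hermitian positive definite

eigvals, eigvecs = np.linalg.eigh(R)       # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]          # reorder to descending
lam = eigvals[order]
G0 = eigvecs[:, order].conj().T            # rows are psi_i^H (P omitted)

Re = G0 @ R @ G0.conj().T
off_diag = np.linalg.norm(Re - np.diag(np.diag(Re)))

# For any other unitary matrix the diagonal is majorized by the eigenvalues.
Q, _ = np.linalg.qr(rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M)))
diag_other = np.sort(np.real(np.diag(Q @ R @ Q.conj().T)))[::-1]
partial_ok = bool(np.all(np.cumsum(diag_other) <= np.cumsum(lam) + 1e-9))
```

With the eigenvector choice `Re` is diagonal and carries the eigenvalues themselves, while a generic unitary matrix produces a diagonal whose sorted partial sums stay below those of the eigenvalues with the same total.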

In  both  cases,  the  optimizing  matrix  Re  is  diagonal,  i.e.  under  optimality  the 
noise  components  in  the  various  subchannel  outputs  are  mutually  uncorrelated.  The 
optimizing $G_0$ consists of the eigenvectors of $R$, and the subchannel output noise
variances  are  the  eigenvalues  of  RVl  and  R  respectively.  These  eigenvalues  must  be 
rearranged  between  the  subchannels  to  ensure  that  J  in  (4.27)  is  minimum.  This  will 
in  effect  specify  the  permutation  matrix  P,  which,  however,  may  not  be  unique. 

Most importantly, reinforcing the arguments made in the justification of the three step breakdown of the overall power minimization problem, the cost function (4.27) reduces under the optimum selection of $G_0$ to one in which the $a_{j,k}$ are the eigenvalues of $R$, optimally arranged. Consequently the selection of $A$ must be guided by the need to assign these eigenvalues in an optimal way.


5.3  Selecting  A 


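Since the optimizing $G_0$ results in the cost function having a value obtained by replacing $a_{j,k}$ by suitably arranged eigenvalues of $R$ in (3.23), Fact 5 and Theorem 5.2 show that the optimizing $A$ must be such that the set of resulting eigenvalues of $R$ must weakly super majorize all possible sets of attainable eigenvalues. As $V$ is unitary in (3.23), the eigenvalues of $R$ are the same as those of

\[ \Omega(A) = \Lambda^{-1} \begin{bmatrix} I_M & A \end{bmatrix} \begin{bmatrix} U_0^H \\ U_1^H \end{bmatrix} R_v \begin{bmatrix} U_0 & U_1 \end{bmatrix} \begin{bmatrix} I_M \\ A^H \end{bmatrix} \Lambda^{-1} \]

\[ = \Lambda^{-1} \begin{bmatrix} I_M & A \end{bmatrix} \begin{bmatrix} U_0^H R_v U_0 & U_0^H R_v U_1 \\ U_1^H R_v U_0 & U_1^H R_v U_1 \end{bmatrix} \begin{bmatrix} I_M \\ A^H \end{bmatrix} \Lambda^{-1} \]

\[ = \Lambda^{-1} \left[ U_0^H R_v U_0 + U_0^H R_v U_1 A^H + A U_1^H R_v U_0 + A U_1^H R_v U_1 A^H \right] \Lambda^{-1}. \tag{5.35} \]

Now observe that as $R_v$ is positive definite, and $U = \begin{bmatrix} U_0 & U_1 \end{bmatrix}$ is unitary,

\[ \begin{bmatrix} U_0^H \\ U_1^H \end{bmatrix} R_v \begin{bmatrix} U_0 & U_1 \end{bmatrix} = \begin{bmatrix} U_0^H R_v U_0 & U_0^H R_v U_1 \\ U_1^H R_v U_0 & U_1^H R_v U_1 \end{bmatrix} \]

is positive definite. Thus the matrices $U_1^H R_v U_1$, $U_0^H R_v U_0$ and

\[ U_0^H R_v U_0 - U_0^H R_v U_1 \left( U_1^H R_v U_1 \right)^{-1} U_1^H R_v U_0 \tag{5.36} \]

must each be positive definite and nonsingular. Direct verification shows that because of (5.35),

\[ \Omega(A) = \Lambda^{-1} \Big\{ U_0^H R_v U_0 - U_0^H R_v U_1 \left( U_1^H R_v U_1 \right)^{-1} U_1^H R_v U_0 + \left[ U_0^H R_v U_1 \left( U_1^H R_v U_1 \right)^{-1} + A \right] U_1^H R_v U_1 \left[ A^H + \left( U_1^H R_v U_1 \right)^{-1} U_1^H R_v U_0 \right] \Big\} \Lambda^{-1}. \]

Define

\[ Q_1 = \Lambda^{-1} \left[ U_0^H R_v U_0 - U_0^H R_v U_1 \left( U_1^H R_v U_1 \right)^{-1} U_1^H R_v U_0 \right] \Lambda^{-1} \tag{5.37} \]

and

\[ Q_2 = \Lambda^{-1} \left[ U_0^H R_v U_1 \left( U_1^H R_v U_1 \right)^{-1} + A \right] U_1^H R_v U_1 \left[ A^H + \left( U_1^H R_v U_1 \right)^{-1} U_1^H R_v U_0 \right] \Lambda^{-1}. \tag{5.38} \]

Clearly $Q_1$ is positive definite and $Q_2$ is positive semidefinite. Only $Q_2$ depends on $A$. Defining $\lambda_i(\Omega(A))$ as the eigenvalues of $\Omega(A)$, from Lemma 5.1 and Fact 1,

\[ \{\lambda_i(\Omega(A))\}_{i=1}^{M} \prec \{\lambda_i(Q_1) + \lambda_i(Q_2)\}_{i=1}^{M} \implies \{\lambda_i(\Omega(A))\}_{i=1}^{M} \prec^w \{\lambda_i(Q_1) + \lambda_i(Q_2)\}_{i=1}^{M}. \]

Since $\lambda_i(Q_2) \ge 0$, from Fact 3

\[ \{\lambda_i(Q_1) + \lambda_i(Q_2)\}_{i=1}^{M} \prec^w \{\lambda_i(Q_1)\}_{i=1}^{M}. \tag{5.39} \]

Thus, from Fact 2,

\[ \{\lambda_i(\Omega(A))\}_{i=1}^{M} \prec^w \{\lambda_i(Q_1)\}_{i=1}^{M}. \tag{5.40} \]

Thus the optimizing $A$ is one which forces $Q_2 = 0$, i.e.

\[ A = -U_0^H R_v U_1 \left( U_1^H R_v U_1 \right)^{-1}. \tag{5.41} \]

This is independent of $G_0$ and the optimum bit rate allocations $b_{j,k}$. Instead it is determined exclusively by $R_v$, provided by the second order statistics of $v(k)$, and $U_i$, provided by the SVD of the blocked channel equalizer combination $\mathcal{C}(z)$. The resulting value of $R$ is

\[ R = V \Lambda^{-1} \left[ U_0^H R_v U_0 - U_0^H R_v U_1 \left( U_1^H R_v U_1 \right)^{-1} U_1^H R_v U_0 \right] \Lambda^{-1} V^H. \tag{5.42} \]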
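The effect of this choice of $A$ can be seen on random data. The sketch below is a numerical illustration only: the sizes, the random unitary $U$, the positive definite $R_v$ and the singular values in `lam` are all hypothetical, and `omega` and `optimum_A` are names introduced for this example, following the expressions in (5.35) and (5.41).

```python
import numpy as np

def omega(A, U0, U1, Rv, lam):
    """Lambda^{-1} [I A] U^H Rv U [I A]^H Lambda^{-1}, following (5.35)."""
    M = U0.shape[1]
    U = np.hstack([U0, U1])
    T = np.hstack([np.eye(M), A]) @ U.conj().T   # M x N
    Li = np.diag(1.0 / lam)
    return Li @ T @ Rv @ T.conj().T @ Li

def optimum_A(U0, U1, Rv):
    """A = -U0^H Rv U1 (U1^H Rv U1)^{-1}, following (5.41)."""
    B = U0.conj().T @ Rv @ U1
    D = U1.conj().T @ Rv @ U1
    return -B @ np.linalg.inv(D)

rng = np.random.default_rng(2)
N, M = 6, 4
U, _ = np.linalg.qr(rng.standard_normal((N, N)))   # random orthogonal U
U0, U1 = U[:, :M], U[:, M:]
W = rng.standard_normal((N, N))
Rv = W @ W.T + np.eye(N)                            # positive definite noise
lam = np.array([2.0, 1.5, 1.0, 0.5])                # hypothetical singular values

A_opt = optimum_A(U0, U1, Rv)
A_other = rng.standard_normal((M, N - M))
eig_opt = np.linalg.eigvalsh(omega(A_opt, U0, U1, Rv, lam))
eig_other = np.linalg.eigvalsh(omega(A_other, U0, U1, Rv, lam))
```

Because the difference between the two matrices is the positive semidefinite term $Q_2$, every sorted eigenvalue for `A_other` is at least as large as the corresponding one for `A_opt`, which is stronger than, and implies, the weak super majorization in (5.40).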


5.4  Consolidation 

To summarize, the optimizing $A$ is obtained directly using (5.41), with the channel characteristics supplying $U_i$ and the second order statistics of $v(k)$ supplying $R_v$. This gives $R$ from (5.42). $G_0$ is then provided by the eigenvectors of $R$, permuted so that, with $a_{j,k}$ the eigenvalues of $R$, an optimum arrangement of (4.27) is attained. This gives the requisite $\sigma^2_{e_{j,k}}$, and (4.26) gives the optimum bit allocations $b_{j,k}$.

It is interesting to note that the solution of $A$ is identical to that given in [11] for the single user case. Modulo the permutation required to enforce the optimum rearrangement requirement, the optimizing $G_0$ is also the same as for the single user case. The only practical effect of the permutations is to rearrange the rows of $G_0$ and the columns of $S$, i.e. rearranging $F_i(z)$ and $H_i(z)$. Thus, even though the optimum bit rate allocations differ in the single and multiuser settings, the transceiver itself is identical.


6  Numerical  Examples 

We now present two numerical examples illustrating the theory developed in the previous sections. Our goal is to compare the required transmission power in three schemes: (i) DFT based DMT under no optimum bit allocation, (ii) DFT based DMT with optimum bit allocation, (iii) Zero padding Generalized DMT.

In all cases the channel/equalizer combination has the transfer function $C(z) = 1 + 0.25 z^{-1}$.

We consider first a two user case, with the same SER requirement and modulation scheme for each user. The target bit rate of one user is 5/3 of the other. Figure 4 compares the transmit powers of the three schemes listed above (normalized by the same constant for each user) as the number of channels per user ($L$) is varied. It also depicts the power spectral density (psd) of the noise at the equalizer output.


Transmitted  power  under  different  DMT  and  bit  allocation  schemes 


Figure  4:  Noise  psd  and  relative  transmitted  power  levels  for  schemes  (i)-(iii)  in  the 
two  user  case. 

We note that the Generalized DMT provides roughly 8 dB savings in transmitted power over scheme (ii), and about 14 dB over scheme (i).

These experiments are repeated for a three user case, with the same SER requirement and modulation scheme for each user. The target bit rate ratios are 2:3:5. The plots in figure 5 compare the transmit powers, with the curves for the three schemes appearing in the same order. Generalized DMT provides roughly 10 dB savings in transmit power over (ii), and about 15 dB over (i).
Power spectral density of noise

Transmitted power under different DMT and bit allocation schemes

Figure 5: Noise psd and relative transmitted power levels for schemes (i)-(iii) in the three user case.

7  Conclusions 

In  this  paper,  we  have  presented  an  optimum  bit  rate  allocation  strategy  and  transceiver 
design  for  minimizing  the  transmitted  power  when  different  users  have  different  QoS 
requirements.  The  underlying  assumption  is  that  while  each  user  is  assigned  the  same 
number  of  subchannels,  the  SER  and  bit  rate  requirements  may  vary  from  user  to  user, 
as may the modulation scheme. A Generalized DMT structure with zero padding redundancy is considered. In the simulated examples it saves roughly 8 to 15 dB of transmit power relative to traditional DFT based DMT structures. It is shown that while the optimum bit rate allocation strategy differs from the single user case, the optimum transceiver design is identical in both the single and multiuser cases. Our theory assumes that each user is assigned the same number of subchannels. A logical extension is to consider the setting where users may be assigned different numbers of subchannels in accordance with their respective priorities.
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On  Biorthogonal  Nonuniform  Filter  Banks  and  Tree  Structures 


Ashish  Pandharipande,  Soura  Dasgupta 
January  11,  2002 

Abstract 

This  paper  concerns  biorthogonal  nonuniform  filter  banks.  It  is  shown  that  a  tree  structured  filter 
bank  is  biorthogonal  iff  it  is  equivalent  to  a  tree  structured  filter  bank  whose  matching  constituent  levels 
on  the  analysis  and  synthesis  sides  are  themselves  biorthogonal  pairs.  We  then  show  that  a  stronger 
statement  can  be  made  about  dyadic  filter  banks  in  general:  That  a  dyadic  filter  bank  is  biorthogonal  iff 
both  the  analysis  and  synthesis  banks  can  be  decomposed  into  dyadic  trees.  We  further  show  that  these 
decompositions  are  stability  and  FIR  preserving.  These  results,  derived  for  filter  banks  having  filters 
with  rational  transfer  functions,  thus  extend  some  of  the  earlier  comparable  results  for  orthonormal 
filter  banks. 

Index  Terms:  Biorthogonal,  nonuniform,  dyadic,  filter  banks,  tree  structures. 
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1  Introduction 


This paper considers certain structural issues associated with biorthogonal maximally decimated nonuniform filter banks, with emphasis on studying their relationship to tree structured filter banks. The class of filter banks under consideration is depicted in fig. 1, with $H_i(z)$ and $F_i(z)$ rational. Maximal decimation refers to the condition that

\[ \sum_{i=0}^{K} \frac{1}{n_i} = 1. \tag{1.1} \]


Analysis  bank  Synthesis  bank 

Figure  1:  A  nonuniform  filter  bank. 

The arrangement to the left of the filter bank, consisting of the $n_i$-fold decimators and the analysis filters $H_i(z)$, is the analysis bank, and that to the right, with the $n_i$-fold interpolators and $F_i(z)$ as the synthesis filters, is the synthesis bank. This filter bank has the perfect reconstruction (PR) property if its output always equals its input, $\hat{x}(n) = x(n)$. It is known, [6], that the analysis and synthesis filters of a maximally decimated filter bank with the PR property form a biorthogonal system. We will henceforth use the terms biorthogonality and PR interchangeably.

Of particular interest are dyadic filter banks as depicted in fig. 2, a special case of nonuniform filter banks wherein $K = M - 1$, $n_i = 2^{i+1}$ for $0 \le i \le M - 2$ and $n_{M-1} = 2^{M-1}$. The relations between wavelets and such multirate filter banks have been known for some time: [19], [11], [4], [20] connect filter banks to wavelet bases. In particular the dyadic filter bank generates Discrete Time Wavelet Transform (DTWT) bases, [15], while the more general structure of fig. 1 generates wavelet packet bases, [11]. Also important are dyadic tree structured filter banks (TSFBs). Fig. 3 and 4 respectively depict an $(L+1)$-level dyadic tree structured analysis bank (TSAB) and the corresponding tree structured synthesis bank (TSSB). These typically capture DTWT bases, [15].
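As a quick sanity check, the dyadic decimation factors satisfy the maximal decimation condition (1.1) exactly. The sketch below uses exact rational arithmetic; the function names are hypothetical and it assumes $M \ge 2$.

```python
from fractions import Fraction

def dyadic_factors(M):
    """Decimation factors n_0, ..., n_{M-1} of an M-channel dyadic bank."""
    return [2 ** (i + 1) for i in range(M - 1)] + [2 ** (M - 1)]

def is_maximally_decimated(factors):
    """Check that sum_i 1/n_i equals 1 exactly, as in (1.1)."""
    return sum(Fraction(1, n) for n in factors) == 1
```

For example, `dyadic_factors(3)` gives `[2, 4, 4]`, the factors of a 3-channel dyadic bank, whereas a set such as `[2, 4, 8]` fails the check and is therefore not maximally decimated.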


Figure  2:  An  M-channel  dyadic  filter  bank. 


Level  1  Level  2  Level  L+l 


Figure 3: An (L + 1)-level dyadic tree structured analysis bank.

The study of biorthogonal filter banks and tree structures is also motivated by their applications in image coding and compression. Most transforms used in signal processing are orthonormal, having the energy preservation property. Orthonormality is however not compatible with phase linearity in the case of FIR (finite impulse response) filter banks [4], [19]. Biorthogonality on the other hand allows additional freedom to have arbitrary length linear phase filters. Consequently, in the last few years, biorthogonal transforms in general, and biorthogonal wavelet transforms arising from tree structures in particular, are gaining increasing currency in image coding applications, where phase linearity is desirable. Biorthogonal wavelets have also been used in the design of wavelet filters with binary coefficients [4], [12]. Other applications can be found in the areas of image compression [5], [13], fingerprint compression [10], and image coding [3], [9]. Given the importance of biorthogonal wavelets and filter banks in these wide-ranging applications, it becomes natural to investigate connections between filter banks and tree structures under the biorthogonality constraint.

Figure 4: An (L + 1)-level dyadic tree structured synthesis bank.

The results presented in this paper extend two results by Soman and Vaidyanathan, [11]. Specifically, [11] shows that (i) every orthonormal dyadic filter bank is equivalent to a tree structured filter bank; (ii) while this is not true for more general nonuniform filter banks, a general tree structured filter bank is orthonormal iff each of its constituent levels is orthonormal.

These results are extended in this work to the biorthogonal case. We show first that a tree structured filter bank is biorthogonal iff it is equivalent to a tree structured filter bank whose matching constituent levels on the analysis and synthesis sides are themselves biorthogonal pairs. We then show that a stronger statement can be made about dyadic filter banks in general: That a dyadic filter bank as in fig. 2 is biorthogonal iff both the analysis bank and the synthesis bank can be decomposed into dyadic trees as in fig. 3 and 4, with each pair of matching levels on the analysis and synthesis sides forming biorthogonal pairs. It is instructive to note that even if the analysis bank in fig. 2, taken as an operator, is left invertible, there may not exist a synthesis bank of the form in fig. 2 such that the overall filter bank is biorthogonal. Effectively our result shows that only in such a case is a tree structured decomposition impossible. We note that using different proof techniques, recently Akkarakaran, [21], has independently derived certain results on nonuniform filter banks that can be specialized to the results given here.

In  Section  2,  some  preliminary  definitions  are  given.  Basic  results  on  extended  polyphase  matrices 
and their invertibility are developed in Section 3. Results related to the decomposability of biorthogonal
nonuniform  filter  banks  to  tree  structured  filter  banks  are  explored  in  Sections  4  and  5.  Section  4 
considers  general  biorthogonal  tree  structured  filter  banks,  and  Section  5  specializes  to  the  dyadic  case. 
Section  6  then  discusses  certain  stability  and  FIR  issues  related  to  these  decompositions.  Section  7 
presents  conclusions. 


2  Preliminaries 


The general nonuniform filter bank structure of fig. 1 is closely connected to wavelet packet bases. Specifically, with $f_k(n)$ and $h_k(n)$ the impulse responses of the synthesis and analysis filters respectively, one has that

\[ \hat{x}(n) = \sum_{k=0}^{K} \sum_{m} y_k(m) f_k(n - n_k m) \tag{2.2} \]

and

\[ y_k(n) = \sum_{m} h_k(m) x(n_k n - m). \tag{2.3} \]

The functions $f_k(n - n_k m)$ constitute wavelet packet bases, while $y_k(n)$ are the corresponding coefficients. We will call the FB in fig. 1 biorthogonal if, for all $x(n)$,

\[ \hat{x}(n) = x(n) \quad \forall n. \tag{2.4} \]

It can be readily shown from a variation of arguments in [6], [15], that this is equivalent to the requirement that, with $\delta(\cdot)$ the Kronecker delta,

\[ \sum_{n} f_k(n - n_k i)\, h_m(-n + n_m l) = \delta(m - k)\, \delta(l - i). \tag{2.5} \]

This contrasts with the orthonormality property that requires

\[ \sum_{n} f_k(n - n_k i)\, f_m(n - n_m l) = \delta(m - k)\, \delta(l - i) \tag{2.6} \]

and

\[ h_k(n) = f_k(-n). \tag{2.7} \]

The results presented in the paper are derived using polyphase representations. We now present the idea of extended polyphase matrices. Assuming that the filters $H_k(z)$ and $F_k(z)$ have rational transfer functions, in the sequel, we will call $E_{kl}(z)$ and $R_{lk}(z)$ the $l$-th $n_k$-fold type-I and type-II polyphase components of $H_k(z)$ and $F_k(z)$ respectively, if one has

\[ H_k(z) = \sum_{l=0}^{n_k - 1} z^{-l} E_{kl}(z^{n_k}), \tag{2.8} \]

\[ F_k(z) = \sum_{l=0}^{n_k - 1} z^{l} R_{lk}(z^{n_k}). \tag{2.9} \]

To obtain a matrix polyphase representation of fig. 1, one must use an additional device from [16]: The idea is to redraw the nonuniform filter bank in fig. 1 as an equivalent uniform maximally decimated system. Define $N$ to be the least common multiple (l.c.m.) of the $n_k$. Then observe that fig. 5-(a) is equivalent to the $p_k$-channel filter bank in fig. 5-(b) with

\[ p_k = \frac{N}{n_k}. \tag{2.10} \]

Now define the $p_k \times N$ matrix $E_k(z)$ whose $ij$-th element is the $j$-th $N$-fold type-I polyphase component of $z^{-i n_k} H_k(z)$. Similarly define the $N \times p_k$ matrix $R_k(z)$ whose $ji$-th element is the $j$-th $N$-fold type-II polyphase component of $z^{i n_k} F_k(z)$. Henceforth we will call $E_k(z)$ and $R_k(z)$ the $N$-fold extended polyphase representation of $H_k(z)$ and $F_k(z)$ respectively. It is readily seen that in fact fig. 5-(b) is equivalent to fig. 5-(c). Further $v_{ki}(n)$, the output of the downsamplers in fig. 5-(b), are simply certain samples of $y_k(n)$, i.e. $v_{ki}(n) = y_k(p_k n - i)$. Then the $N \times N$ matrices

\[ E(z) = \begin{bmatrix} E_K^T(z) & \cdots & E_0^T(z) \end{bmatrix}^T \tag{2.11} \]

and

\[ R(z) = \begin{bmatrix} R_K(z) & \cdots & R_0(z) \end{bmatrix} \tag{2.12} \]

are the $N$-fold extended polyphase representations of the analysis and synthesis banks of fig. 1 respectively. Further, for any $i_1, i_2, \ldots, i_l$, a subset of $\{0, 1, \ldots, K\}$,

\[ \begin{bmatrix} E_{i_1}^T(z) & E_{i_2}^T(z) & \cdots & E_{i_l}^T(z) \end{bmatrix}^T \]

will be the $N$-fold extended type-I polyphase representation of $H_{i_1}(z), H_{i_2}(z), \ldots, H_{i_l}(z)$. Under these conditions, the overall filter bank in fig. 1 has the equivalent representation in fig. 7.
Thus, for the 3-channel filter bank of fig. 6, one has the equivalent representation of fig. 7, with $N = 4$.

Figure 6: A 3-channel dyadic filter bank.

Here, with $E_{ki}(z)$ the 4-fold type-I polyphase components of $H_k(z)$, one has $E(z)$ given by

\[ E(z) = \begin{bmatrix} E_{20}(z) & E_{21}(z) & E_{22}(z) & E_{23}(z) \\ E_{10}(z) & E_{11}(z) & E_{12}(z) & E_{13}(z) \\ E_{00}(z) & E_{01}(z) & E_{02}(z) & E_{03}(z) \\ z^{-1}E_{02}(z) & z^{-1}E_{03}(z) & E_{00}(z) & E_{01}(z) \end{bmatrix}. \tag{2.13} \]

Likewise, with $R_{ik}(z)$ the 4-fold type-II components of $F_k(z)$, one has

\[ R(z) = \begin{bmatrix} R_{02}(z) & R_{01}(z) & R_{00}(z) & zR_{20}(z) \\ R_{12}(z) & R_{11}(z) & R_{10}(z) & zR_{30}(z) \\ R_{22}(z) & R_{21}(z) & R_{20}(z) & R_{00}(z) \\ R_{32}(z) & R_{31}(z) & R_{30}(z) & R_{10}(z) \end{bmatrix}. \tag{2.14} \]

Note the structural constraints on $R(z)$ and $E(z)$. We will call the analysis (respectively, synthesis) bank left (respectively, right) invertible if there is a $1 \times (K + 1)$ (respectively, $(K + 1) \times 1$) operator $C$ such that the arrangement in fig. 8-(a) (respectively, fig. 8-(b)) is identity. Here onwards we will drop the qualifiers left and right, that is, the invertibility of an analysis bank will automatically refer to its left invertibility, and that of a synthesis bank to its right invertibility.

Also, observe that invertibility of the analysis bank necessitates the nonsingularity of $E(z)$ (that is, $\det(E(z))$ is not the zero function). To see this, suppose that in fig. 8-(a) the arrangement represents an identity operator. Observe that the input to $E(z)$, see fig. 7, is the blocked vector $\begin{bmatrix} x(nN), & x(nN - 1), & \ldots, & x(nN - N + 1) \end{bmatrix}^T$. Then since this vector can be constructed by a linear operation from $x(n)$, and the outputs of $E(z)$ are simply certain rearranged samples of the outputs of the analysis bank in fig. 1, there is an $N \times N$ operator $C$ such that $CE(z) = I$. Since $E(z)$ is a square matrix, it must be nonsingular. A similar reasoning proves that the invertibility of a synthesis bank requires the nonsingularity of $R(z)$. Further, it follows that the filter bank in fig. 1 is biorthogonal iff $R(z)E(z) = I$.

Figure 7: Polyphase representation of filter bank in fig. 1.

Figure 8: Illustration of invertibility. Panels (a) and (b).

Notice that, even if $E(z)$ is invertible, $E^{-1}(z)$ may not have the structure that the polyphase matrix of the synthesis bank must have. For example, consider an analysis bank with polyphase matrix

\[ E(z) = \begin{bmatrix} 1 + 2z^{-1} & 1 & 0 & 0 \\ 1 & 0 & 1 + z^{-1} & 2 \\ 1 & 1 & z^{-1} & 2 \\ z^{-2} & 2z^{-1} & 1 & 1 \end{bmatrix}. \]

The inverse is then

\[ E^{-1}(z) = \frac{1}{2z^2 + 5z + 4} \begin{bmatrix} 3z + z^2 & -z + 2z^2 & z - z^2 & -2z^2 \\ -2 + z^2 & 2 - 3z - 2z^2 & -2 + z + z^2 & 4z + 2z^2 \\ -2 + z^2 & 6 + 2z & -6 - 4z - z^2 & 4z + 2z^2 \\ z^{-1} + 1 - 2z - z^2 & -3z^{-1} - 2 + 2z & 3z^{-1} + 5 + 2z + z^2 & -2 - 3z \end{bmatrix}. \]

Observe that $E^{-1}(z)$ fails to obey the structure in (2.14). Thus even if $E(z)$ is invertible, there may not be a synthesis bank as in fig. 1 that renders the filter bank in fig. 1 biorthogonal. Henceforth should the analysis bank (respectively, synthesis bank) in fig. 1 be such that there exists a synthesis bank (respectively, analysis bank) of the form in fig. 1 for which the filter bank is biorthogonal, then we will call the analysis bank (respectively, synthesis bank) conformally invertible.
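The example above can be checked numerically. The sketch below evaluates the analysis polyphase matrix and a candidate inverse, with denominator $2z^2 + 5z + 4$, at a few rational points using exact arithmetic. It then tests the four entry constraints that the synthesis structure (2.14) imposes on the last column: $R_{03}' = zR_{22}'$, $R_{13}' = zR_{32}'$, $R_{23}' = R_{02}'$ and $R_{33}' = R_{12}'$, where primes denote matrix positions counted from zero. The helper names are assumptions made for this illustration, and a pointwise check supports the claim rather than proving it symbolically.

```python
from fractions import Fraction


def E(z):
    """Analysis polyphase matrix of the example, evaluated at a point z."""
    w = 1 / z  # w stands for z^{-1}
    return [[1 + 2 * w, 1, 0, 0],
            [1, 0, 1 + w, 2],
            [1, 1, w, 2],
            [w * w, 2 * w, 1, 1]]


def E_inv(z):
    """Candidate inverse: adjugate entries divided by 2z^2 + 5z + 4."""
    w = 1 / z
    d = 2 * z * z + 5 * z + 4
    rows = [[z * z + 3 * z, 2 * z * z - z, z - z * z, -2 * z * z],
            [z * z - 2, 2 - 3 * z - 2 * z * z, z * z + z - 2, 2 * z * z + 4 * z],
            [z * z - 2, 2 * z + 6, -z * z - 4 * z - 6, 2 * z * z + 4 * z],
            [w + 1 - 2 * z - z * z, 2 * z - 2 - 3 * w,
             z * z + 2 * z + 5 + 3 * w, -3 * z - 2]]
    return [[e / d for e in row] for row in rows]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def is_identity(M):
    return all(M[i][j] == (1 if i == j else 0)
               for i in range(4) for j in range(4))


def obeys_synthesis_structure(R, z):
    """Entry constraints implied by (2.14) for the 3-channel dyadic case."""
    return (R[0][3] == z * R[2][2] and R[1][3] == z * R[3][2]
            and R[2][3] == R[0][2] and R[3][3] == R[1][2])


points = [Fraction(1, 2), Fraction(3), Fraction(-2, 3)]
inverse_ok = all(is_identity(matmul(E_inv(z), E(z))) for z in points)
structure_ok = any(obeys_synthesis_structure(E_inv(z), z) for z in points)
```

With these definitions `inverse_ok` is expected to be true and `structure_ok` false, matching the observation that the inverse exists but does not have the form (2.14).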

3  Some  results  on  extended  polyphase  matrices 

We now present some preliminary results on polyphase matrices, to be used in later sections. Although these results are stated for analysis banks, they trivially extend to synthesis banks as well.

We will say that a matrix $P(z)$ has linearly dependent rows if there exists a rational vector $q(z) \neq 0$ such that

\[ q^T(z)P(z) = 0. \]

Lemma 3.1 Consider fig. 9(a) and 9(b) with $L = \mathrm{lcm}(n_1, n_2, \ldots, n_M)$, $p_i = L/n_i$, and all filters rational. The $L$-fold extended polyphase matrix of the AB in fig. 9(a) has linearly dependent rows iff the system in fig. 9(b) is identically zero for some transfer functions $\theta_i(z)$ not all zero.

Proof: Using the $p_i$-fold Type-I representation $\theta_i(z) = \sum_{j=0}^{p_i-1} z^{-j}\theta_{ij}(z^{p_i})$ and making use of the Noble identities, fig. 9(b) can be redrawn as shown in fig. 9(c). Also the polyphase matrix of the arrangement to the left of the $\theta_{ij}(z)$ in fig. 9(c) is simply the $L$-fold extended polyphase matrix $E(z)$ of the AB in fig. 9(a). Redrawing the nonuniform AB as a uniform AB in polyphase form as explained in the earlier section, $E(z)$ has linearly dependent rows iff $\begin{bmatrix} \theta_{1,0}(z) & \cdots & \theta_{M,p_M-1}(z) \end{bmatrix} E(z) = 0$, with not all $\theta_i(z) = 0$. ■

Lemma 3.2 Consider fig. 9(a) and 10(a), with $L = \mathrm{lcm}(n_1, n_2, \ldots, n_M)$ and all filters rational. If the $L$-fold extended polyphase matrix of the AB in fig. 9(a) has linearly dependent rows, so does the $NKL$-fold extended polyphase matrix of the AB in fig. 10(a).
Figure 9: Setup and illustration for lemma 3.1. Panels (a), (b) and (c).

Figure 10: Setup and illustration for lemma 3.2. Panels (a) and (b).


Proof: By lemma 3.1, there exist $\theta_i(z)$, not all zero, such that the system in fig. 9(b) is identically zero. We need to show that there exist $\lambda_i(z)$, not all identically zero, such that the system in fig. 10(b) is identically zero. Consider a system wherein the system in fig. 9(b) is preceded by the system consisting of $G(z)$ and the decimator $N$, and the output of fig. 9(b) is followed by a decimator $K$. This new system is also identically zero. Now choosing $\lambda_1(z) = 1$ and $\lambda_i(z) = \theta_i(z)$ for $i > 1$ makes this new system identical to fig. 10(b) provided $\theta_1(z) = F(z)$. Indeed the $\theta_i(z)$ can be so scaled that $\theta_1(z) = F(z)$ provided that $F(z), \theta_1(z) \neq 0$, completing the proof. The case $F(z) = 0$ is trivial, and for the case $\theta_1(z) = 0$ we only have to make the choice $\lambda_1(z) = 0$ instead of $\lambda_1(z) = 1$. ■

4  Biorthogonal  Tree  Structured  Filter  Banks 

The  Introduction  had  given  the  example  of  a  dyadic  tree  structured  filter  bank.  In  this  section  we  deal 
with  more  general  tree  structured  filter  banks,  and  discuss  conditions  under  which  such  a  filter  bank  is 
biorthogonal.  Some  notation  for  general  tree  structured  filter  banks  will  be  introduced  first. 

A general TSAB is depicted in fig. 11(a). Through the repeated use of Noble identities, any such filter bank is equivalent to the analysis bank in fig. 1. For example, the analysis bank in fig. 11(a) is equivalent to that in fig. 11(b).

Figure 11: An example tree (analysis side). Panels (a) and (b).


Whereas  in  a  dyadic  tree,  only  one  of  the  two  branches  on  a  given  level  divides  further,  this  is  not  in 
general  the  case  with  arbitrary  trees.  Further,  in  a  general  tree,  two  different  levels  may  have  different 
number  of  branches.  In  the  dyadic  case,  each  level  has  precisely  two  branches. 
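The reduction of a tree to an equivalent nonuniform bank rests on the Noble identity: filtering by $H(z^M)$ and then decimating by $M$ equals decimating by $M$ and then filtering by $H(z)$. The sketch below checks this on finite sequences, and then checks that a two-stage branch (filter $A(z)$, decimate by 2, filter $B(z)$, decimate by 2) equals a single filter $A(z)B(z^2)$ followed by decimation by 4. The filters and input are arbitrary illustrative choices, not taken from the paper.

```python
def convolve(x, h):
    """Full linear convolution of two finite sequences."""
    y = [0] * (len(x) + len(h) - 1)
    for i, a in enumerate(x):
        for j, b in enumerate(h):
            y[i + j] += a * b
    return y


def downsample(x, M):
    """Keep every M-th sample, starting with sample 0."""
    return x[::M]


def expand(h, M):
    """Taps of H(z^M): insert M - 1 zeros between the taps of H(z)."""
    out = [0] * (M * (len(h) - 1) + 1)
    out[::M] = h
    return out


x = [3, -1, 4, 1, -5, 9, 2, -6, 5, 3, -5, 8]   # arbitrary test input
a = [1, 2, -1]                                  # first-stage filter A(z)
b = [3, 0, 1, 1]                                # second-stage filter B(z)

# Noble identity: H(z^M) then decimation equals decimation then H(z).
noble_lhs = downsample(convolve(x, expand(b, 2)), 2)
noble_rhs = convolve(downsample(x, 2), b)

# Two-level branch versus its single-filter equivalent A(z) B(z^2), decimate by 4.
two_stage = downsample(convolve(downsample(convolve(x, a), 2), b), 2)
single = downsample(convolve(x, convolve(a, expand(b, 2))), 4)
```

Applying the same step repeatedly along every root-to-leaf path gives one filter and one decimation factor per channel, which is the sense in which a tree is equivalent to a nonuniform bank.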

In a general TSAB, filters that have the same input will be said to belong to the same level. Thus, in fig. 11(a), the filter sets $\{A_0, A_1\}$, $\{D_0, D_1, D_2\}$, $\{B_0, B_1\}$, and $\{C_0, C_1, C_2\}$ each constitute a separate level. We will call a level an output level if its outputs do not branch out further. For example, $\{A_0, A_1\}$, $\{D_0, D_1, D_2\}$, $\{B_0, B_1\}$ are all at output levels. We will say a TSSB matches a TSAB if it is topologically a mirror image of the TSAB. For example, the TSSB in fig. 12 matches the TSAB in fig. 11(a). Of course the filters appearing in a matching TSSB may differ from their counterparts in the TSAB in question.

Further we will designate the levels constituting $\{P_0, P_1\}$, $\{Q_0, Q_1\}$, $\{S_0, S_1, S_2\}$, $\{T_0, T_1, T_2\}$ as respectively the matching levels of $\{A_0, A_1\}$, $\{B_0, B_1\}$, $\{C_0, C_1, C_2\}$, $\{D_0, D_1, D_2\}$. We will call the tree structured filter bank critically sampled if its equivalent in fig. 1 is also critically sampled.

Throughout,  the  following  assumption  applies: 

Assumption 4.1 Each level in the TSAB is a critically sampled uniform analysis bank with all filters rational.

Figure 12: Matching tree structured synthesis bank.

In  a  dyadic  tree,  each  level  is  a  2-channel  critically  sampled  uniform  filter  bank.  Observe  Assumption 
4.1  guarantees  that  the  TSAB  is  critically  sampled. 

Throughout  this  section  the  results  presented  are  for  TSABs.  Proof  of  extensions  to  the  TSSBs, 
being  trivially  similar,  are  omitted.  We  now  state  and  prove  the  main  result  of  this  section. 

Theorem 4.1 A TSAB satisfying Assumption 4.1 is invertible iff each of its constituent levels is invertible. Under this condition the TSAB has an inverse that is a matching TSSB, with each matching level the inverse of its counterpart in the TSAB.

Proof: The if part is trivial. The only if part is proved by contradiction. If any of the levels is not invertible, its polyphase matrix has linearly dependent rows. Hence by lemma 3.2 the extended polyphase matrix of the TSAB also has linearly dependent rows, a contradiction. ■

Thus  if  a  TSAB  is  invertible,  then  not  only  is  it  conformally  invertible,  but  in  fact  its  left  inverse  is  a 
matching  TSSB.  This  thus  generalizes  the  comparable  result  in  [11]  derived  for  orthonormal  trees. 

The question remains: Suppose a conformally invertible, nonuniform analysis bank has a set of decimation ratios that are compatible with a tree structure. Is it then necessarily decomposable into a TSAB? The answer in general is no. To see this, consider the example, [11], of a TSSB shown in fig. 12. This can be redrawn using Noble identities as in fig. 13(a). Suppose this synthesis side is conformally invertible. Now consider the synthesis bank in fig. 13(b), with $L_0(z) = (P_1(z^3)S_0(z) + Q_0(z^3)S_2(z))/\sqrt{2}$, $L_1(z) = (P_1(z^3)S_0(z) - Q_0(z^3)S_2(z))/\sqrt{2}$. Clearly fig. 13(a) and fig. 13(b) have the same MISO (multiple input single output) relationship. However, it is not possible to decompose fig. 13(b) into a tree structure. The reason is as follows. For this synthesis bank to be represented as a tree, $L_0(z)$ (and $L_1(z)$) must be expressible in the form $P_1'(z^3)S_0(z)$ or $Q_0'(z^3)S_2(z)$ (see fig. 12 and fig. 13(a)). Neither is possible unless $S_0(z) = S_2(z)$. However, since the synthesis bank at each level of the tree in fig. 12 is invertible, $S_0(z) \neq S_2(z)$. Hence this synthesis bank cannot be decomposed into a tree structure. In the next section we show that for dyadic nonuniform analysis banks the decomposition is always possible.

Figure 13: Synthesis bank which cannot be decomposed into a tree structure. Panels (a) and (b).

5  Dyadic  filter  banks  and  tree  structures 

The  previous  section  showed  that  every  invertible  TSAB  admits  an  inverse  that  is  a  matching  TSSB. 
We  now  turn  to  the  special  case  of  dyadic  filter  banks  where  a  stronger  result  is  possible. 
Recall that a general dyadic tree structured filter bank is equivalent to a dyadic nonuniform filter
bank  of  the  form  in  fig.  2.  In  this  section  we  show  that  a  dyadic  nonuniform  analysis  bank  (respectively, 
synthesis  bank)  of  the  form  in  fig.  2  is  conformally  invertible  iff  it  admits  a  tree  structured  decomposition. 
In  view  of  the  results  of  the  previous  section  this  therefore  also  shows  that  every  conformally  invertible 
dyadic  analysis  bank  has  a  tree  structured  inverse. 


Figure 14: Tree decomposition of a dyadic filter bank obeying (5.15) and (5.16). Panels (a) and (b).

Consider now two channels of a dyadic nonuniform filter bank depicted in fig. 14, with $G_i(z)$, $F_i(z)$ all rational. Then from the Noble identities it is evident that this nonuniform filter bank is decomposable as in fig. 14(b) if and only if

\[ G_i(z) = S(z)S_i(z^{2^N}) \tag{5.15} \]

and

\[ F_i(z) = Q(z)Q_i(z^{2^N}). \tag{5.16} \]

This in turn requires that

\[ \frac{G_1(z)}{G_2(z)} = \frac{S_1(z^{2^N})}{S_2(z^{2^N})} \tag{5.17} \]

and

\[ \frac{F_1(z)}{F_2(z)} = \frac{Q_1(z^{2^N})}{Q_2(z^{2^N})}. \tag{5.18} \]

Put differently, the decomposability of fig. 14(a) into fig. 14(b) is equivalent to the requirement that for all $0 \leq k \leq 2^N - 1$,

\[ \frac{G_1(z)}{G_2(z)} = \frac{G_1(\beta_k z)}{G_2(\beta_k z)} \quad \text{and} \quad \frac{F_1(z)}{F_2(z)} = \frac{F_1(\beta_k z)}{F_2(\beta_k z)}, \tag{5.19} \]

where $\beta_k = e^{-j2\pi k/2^N}$.

Lemma 5.1 describes a setting in which this is necessary.

Figure 15: An equivalent structure to fig. 14(a).
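Condition (5.19) says that the ratio $G_1(z)/G_2(z)$ is unchanged when $z$ is rotated by any $2^N$-th root of unity. The following is a minimal numerical sketch with $N = 2$, assuming hypothetical filters of the form (5.15); the particular $S$, $S_1$, $S_2$ and the spoiling factor are illustrative choices, not filters from the paper. A single test point gives only a necessary check.

```python
import cmath

N = 2            # hypothetical depth, so the rotation group has 2**N = 4 elements
P = 2 ** N


def S(z):        # common factor S(z)
    return 1 + 0.5 / z


def S1(u):       # S_1 evaluated at u = z**P
    return 1 + 0.25 / u


def S2(u):       # S_2 evaluated at u = z**P
    return 2 - 0.5 / u


def G1(z):       # has the form (5.15)
    return S(z) * S1(z ** P)


def G2(z):       # has the form (5.15)
    return S(z) * S2(z ** P)


def G2_bad(z):   # extra factor that is not a function of z**P
    return G2(z) * (1 + 0.3 / z)


def ratio_is_invariant(Ga, Gb, z, tol=1e-9):
    """Check Ga(z)/Gb(z) == Ga(beta_k z)/Gb(beta_k z) for every beta_k."""
    base = Ga(z) / Gb(z)
    for k in range(P):
        beta = cmath.exp(-2j * cmath.pi * k / P)
        if abs(Ga(beta * z) / Gb(beta * z) - base) > tol:
            return False
    return True


z0 = 0.7 + 0.4j                                   # arbitrary test point
decomposable = ratio_is_invariant(G1, G2, z0)
not_decomposable = ratio_is_invariant(G1, G2_bad, z0)
```

Here the pair `(G1, G2)` passes the test while `(G1, G2_bad)` fails it, because the added factor breaks the dependence on $z^{2^N}$ alone.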


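Lemma 5.1 Consider the structure in fig. 14(a) with $G_i(z)$, $F_i(z)$ all non-zero. Suppose for some $G(z)$, $F(z)$, fig. 14(a) is equivalent to fig. 15. Then (5.19) holds.

Proof: Define $W_M = e^{-j2\pi/M}$, and

\[ \mathcal{G}_e(z) = \begin{bmatrix} G_1(z) & G_1(zW_{2^{N+1}}^{2}) & \cdots & G_1(zW_{2^{N+1}}^{2^{N+1}-2}) \\ G_2(z) & G_2(zW_{2^{N+1}}^{2}) & \cdots & G_2(zW_{2^{N+1}}^{2^{N+1}-2}) \end{bmatrix}, \tag{5.20} \]

\[ \mathcal{G}_o(z) = \begin{bmatrix} G_1(zW_{2^{N+1}}) & G_1(zW_{2^{N+1}}^{3}) & \cdots & G_1(zW_{2^{N+1}}^{2^{N+1}-1}) \\ G_2(zW_{2^{N+1}}) & G_2(zW_{2^{N+1}}^{3}) & \cdots & G_2(zW_{2^{N+1}}^{2^{N+1}-1}) \end{bmatrix}, \tag{5.21} \]

\[ X_e(z) = \begin{bmatrix} X(z) & X(zW_{2^{N+1}}^{2}) & \cdots & X(zW_{2^{N+1}}^{2^{N+1}-2}) \end{bmatrix}^T, \tag{5.22} \]

\[ X_o(z) = \begin{bmatrix} X(zW_{2^{N+1}}) & X(zW_{2^{N+1}}^{3}) & \cdots & X(zW_{2^{N+1}}^{2^{N+1}-1}) \end{bmatrix}^T, \tag{5.23} \]

\[ \mathbf{G}(z) = \begin{bmatrix} G(z) & G(zW_{2^N}) & G(zW_{2^N}^{2}) & \cdots & G(zW_{2^N}^{2^N-1}) \end{bmatrix}. \tag{5.24} \]

Note that

\[ W_{2^{N+1}}^{2} = W_{2^N}. \tag{5.25} \]

Thus, in fig. 14(a)

\[ Y(z) = \frac{1}{2^{N+1}} \begin{bmatrix} F_1(z) & F_2(z) \end{bmatrix} \{ \mathcal{G}_e(z)X_e(z) + \mathcal{G}_o(z)X_o(z) \}. \tag{5.26} \]

Because of (5.25), in fig. 15

\[ Y(z) = \frac{1}{2^N} F(z)\mathbf{G}(z)X_e(z). \tag{5.27} \]

Thus for all combinations of $X_e(z)$, $X_o(z)$, the right hand sides of (5.26) and (5.27) are equal. Consequently for all $z$,

\[ \begin{bmatrix} F_1(z) & F_2(z) \end{bmatrix} \mathcal{G}_o(z) = 0. \tag{5.28} \]

Thus, from (5.21), for all $0 \leq k \leq 2^N - 1$,

\[ \frac{F_1(z)}{F_2(z)} = -\frac{G_2(zW_{2^{N+1}}^{2k+1})}{G_1(zW_{2^{N+1}}^{2k+1})}. \tag{5.29} \]

Note that $W_{2^{N+1}}^{2k+1} = e^{-j2\pi(2k+1)/2^{N+1}} = W_{2^N}^{k}W_{2^{N+1}}$. Thus, for all $0 \leq k \leq 2^N - 1$,

\[ \frac{F_1(z)}{F_2(z)} = -\frac{G_2(\alpha z\beta_k)}{G_1(\alpha z\beta_k)}, \tag{5.30} \]

where $\alpha = W_{2^{N+1}}$. Consequently, replacing $\alpha z$ by $z$, for all $1 \leq k \leq 2^N - 1$,

\[ \frac{G_1(z)}{G_2(z)} = \frac{G_1(z\beta_k)}{G_2(z\beta_k)}, \tag{5.31} \]

that is, the first equality in (5.19) holds.

Further, since for every $0 \leq k, l \leq 2^N - 1$ there exists $0 \leq m \leq 2^N - 1$ for which $\beta_k\beta_l = \beta_m$, the second equality in (5.19) also holds. ■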

The next lemma shows that, in fact, in the decomposition in fig. 14(b), the $x(n)$ to $y(n)$ relation can be chosen to be biorthogonal.

Lemma 5.2 Consider the structure in fig. 14(b) with all filters non-zero rationals. Suppose the system from $x(n)$ to $y(n)$ is non-zero and is equivalent to the structure in fig. 15. Then one can select the filters in such a way that

(i) the $x(n)$ to $y(n)$ relation is the identity system, and

(ii) the $x(n)$ to $\begin{bmatrix} x_1(n) & x_2(n) \end{bmatrix}^T$ and $\begin{bmatrix} x_1(n) & x_2(n) \end{bmatrix}^T$ to $y(n)$ relationship is preserved.

Proof: We first argue that the uniform filter bank relating $x(n)$ to $y(n)$ in fig. 14(b) is alias free. From [15], this is guaranteed if

\[ Q_1(z)S_1(-z) + Q_2(z)S_2(-z) = 0 \iff \frac{S_1(-z)}{S_2(-z)} + \frac{Q_2(z)}{Q_1(z)} = 0. \tag{5.32} \]

From (5.17), (5.18)

\[ \frac{S_1(z)}{S_2(z)} = \frac{G_1(z^{1/2^N})}{G_2(z^{1/2^N})}, \qquad \frac{Q_1(z)}{Q_2(z)} = \frac{F_1(z^{1/2^N})}{F_2(z^{1/2^N})}. \]

Thus (5.32) is equivalent to

\[ \frac{G_1(z^{1/2^N}e^{-j\pi/2^N})}{G_2(z^{1/2^N}e^{-j\pi/2^N})} + \frac{F_2(z^{1/2^N})}{F_1(z^{1/2^N})} = 0. \]

This clearly holds from (5.30) with $k = 0$, and from the fact that $\alpha = e^{-j\pi/2^N}$.

Thus, the $x(n)$ to $y(n)$ relationship is LTI with transfer function $T(z)$. Clearly $T(z) \neq 0$, as otherwise the $x(n)$ to $y(n)$ relation will be zero. Then replacing $Q_i(z)$ by $T^{-1}(z)Q_i(z)$ and $Q(z)$ by $T(z^{2^N})Q(z)$ yields the result. ■
Figure 16: An (N + 2)-channel filter bank.


Figure  17:  Modified  dyadic  filter  bank. 

The next Lemma (see [2] for details) relates the $2^N$-fold extended polyphase matrices of a filter bank to its $2^{N+1}$-fold extended polyphase matrices.

Lemma 5.3 Consider the last $N$ channels of the dyadic filter bank in fig. 16 with all filters rational. Let $E(z)$ (respectively, $R(z)$) be the $2^N$-fold extended type-I (respectively, type-II) polyphase matrices of the AB (respectively, SB), and $\bar{E}(z)$ (respectively, $\bar{R}(z)$) their extended $2^{N+1}$-fold counterparts. Suppose

\[ E(z) = E_0(z^2) + z^{-1}E_1(z^2) \quad \text{and} \quad R(z) = R_0(z^2) + zR_1(z^2). \]

Then

\[ \bar{E}(z) = \begin{bmatrix} E_0(z) & E_1(z) \\ z^{-1}E_1(z) & E_0(z) \end{bmatrix} \quad \text{and} \quad \bar{R}(z) = \begin{bmatrix} R_0(z) & zR_1(z) \\ R_1(z) & R_0(z) \end{bmatrix}. \tag{5.33} \]


We  need  one  more  preparatory  lemma. 


Lemma 5.4 Consider the dyadic filter bank in fig. 16, with all filters rational and $N > 1$. Suppose this filter bank is biorthogonal. Then there exist rational $H_N'(z)$ and $F_N'(z)$ such that the filter bank in fig. 17 is biorthogonal.

Proof: Call $\hat{E}(z)$ (respectively, $\hat{R}(z)$) the $2^{N+1}$-fold extended type-I (respectively, type-II) polyphase matrix of the analysis bank (respectively, synthesis bank) in fig. 16. Further call $\bar{E}(z)$ (respectively, $\bar{R}(z)$) the $2^{N+1}$-fold extended type-I (respectively, type-II) polyphase matrix of the analysis bank (respectively, synthesis bank) obtained by removing the upper two channels in fig. 16, and $E(z)$ (respectively, $R(z)$) the $2^N$-fold extended polyphase matrix of this $N$-channel AB (respectively, SB). Then

\[ \hat{R}(z)\hat{E}(z) = I \iff \hat{E}(z)\hat{R}(z) = I. \]

Since for some $E'(z)$, $R'(z)$,

\[ \hat{E}(z) = \begin{bmatrix} E'(z) \\ \bar{E}(z) \end{bmatrix}, \qquad \hat{R}(z) = \begin{bmatrix} R'(z) & \bar{R}(z) \end{bmatrix}, \]

one has

\[ \bar{E}(z)\bar{R}(z) = I. \]

With $E_i(z)$, $R_i(z)$ as in lemma 5.3, by lemma 5.3

\[ \begin{bmatrix} E_0(z) & E_1(z) \\ z^{-1}E_1(z) & E_0(z) \end{bmatrix} \begin{bmatrix} R_0(z) & zR_1(z) \\ R_1(z) & R_0(z) \end{bmatrix} = I. \]

Thus

\[ E_0(z)R_0(z) + E_1(z)R_1(z) = I, \]
\[ zE_0(z)R_1(z) + E_1(z)R_0(z) = 0. \]

Thus

\[ E(z)R(z) = [E_0(z^2) + z^{-1}E_1(z^2)][R_0(z^2) + zR_1(z^2)] \]
\[ = E_0(z^2)R_0(z^2) + E_1(z^2)R_1(z^2) + z^{-1}E_1(z^2)R_0(z^2) + zE_0(z^2)R_1(z^2) = I. \]

Note $E(z)$ is $(2^N - 1) \times 2^N$ and $R(z)$ is $2^N \times (2^N - 1)$. Clearly there exist rational $2^N$-vectors $p_1(z)$, $p_2(z)$ such that

\[ \begin{bmatrix} p_1^T(z) \\ E(z) \end{bmatrix} \begin{bmatrix} p_2(z) & R(z) \end{bmatrix} = I. \]

Hence the result. ■


The result goes beyond the results of Section 4 in the following respect. It shows that if an $(N + 1)$-channel dyadic analysis bank is conformally invertible, then one can augment its channels indexed from 0 to $N - 1$ in the manner depicted in fig. 17 so that the resulting $N$-channel dyadic analysis bank is not only conformally invertible, but its inverse's channels indexed from 0 to $N - 1$ coincide with the corresponding channels of the inverse of the $(N + 1)$-channel dyadic analysis bank.

We  can  now  prove  the  main  result  of  this  section. 

Theorem 5.1 Consider the dyadic nonuniform filter bank in fig. 2 with $M > 2$ and all filters rational. Suppose the filter bank is biorthogonal. Then the analysis bank and the synthesis bank are equivalent to a TSAB and a TSSB respectively.

Proof: Suppose the result holds for some $N = M - 1 > 2$. Consider the filter bank in fig. 16 and assume it is biorthogonal. Then from Lemma 5.4, there exists a filter bank of the form in fig. 17 that is biorthogonal and has all filters rational. Consequently channels $N + 1$ and $N + 2$ in fig. 16 are together equivalent to the top channel in fig. 17. Thus from Lemmas 5.2 and 5.3 the top two channels in fig. 16 are equivalent to a structure as in fig. 14(b), with the 2-channel uniform filter bank relating $x(n)$ to $y(n)$ biorthogonal. Thus the filter bank in fig. 17 with $S(z) = H_N'(z)$ and $Q(z) = F_N'(z)$ is biorthogonal. Then a simple inductive argument proves the result. ■

Taken  together  with  the  results  of  the  previous  section,  we  have  the  following  result  for  dyadic 
analysis  banks. 

Theorem 5.2 A critically sampled dyadic analysis bank with rational filters is conformally invertible iff both the following conditions hold:

(i) The analysis bank can be decomposed into a dyadic TSAB.

(ii) The two channel filter banks appearing at every level of this dyadic tree are themselves invertible.

Further, the inverse is also decomposable into a dyadic TSSB. The levels of the TSSB can be chosen so that they form biorthogonal pairs with their matching levels in the TSAB.

Note also that this provides a simple test for conformal invertibility: check if the analysis bank or synthesis bank is decomposable into a dyadic tree structure with invertibility of each level. Decomposability of course requires testing whether, for all $i \geq 1$, the relevant ratios of the filters $H_i(z)$ are rational functions in the appropriate even power of $z$ (compare (5.17)), and is simple to verify.
Indeed turn to the example at the end of Section 2, of a dyadic analysis bank that, though left invertible, was not conformally so. In that case $H_1(z) = 1 + z^{-2} + 2z^{-3} + z^{-6}$ and $H_2(z) = 1 + z^{-1} + 2z^{-4}$. Clearly, one cannot express $H_1(z)/H_2(z)$ as $G(z^2)$, with $G(z)$ rational. Hence this analysis bank is not decomposable into a tree structure.
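A rational function can be written as $G(z^2)$ exactly when it is even, that is, when its value is unchanged under $z \to -z$. The sketch below applies this to the two filters of the Section 2 example, read off from the second and first rows of its polyphase matrix as $1 + z^{-2} + 2z^{-3} + z^{-6}$ and $1 + z^{-1} + 2z^{-4}$. The comparison is cross-multiplied to avoid division, and it is evaluated at a few rational points, so a failure at any point disproves evenness while a pass at all points only suggests it. The function names and the second, even, pair are assumptions made for illustration.

```python
from fractions import Fraction


def H1(z):
    """1 + z^-2 + 2 z^-3 + z^-6."""
    w = 1 / z
    return 1 + w ** 2 + 2 * w ** 3 + w ** 6


def H2(z):
    """1 + z^-1 + 2 z^-4."""
    w = 1 / z
    return 1 + w + 2 * w ** 4


def even_a(z):
    """Hypothetical filter A(z^2) S(z) with S(z) = 1 + z^-1."""
    w = 1 / z
    return (1 + w ** 2) * (1 + w)


def even_b(z):
    """Hypothetical filter B(z^2) S(z) sharing the same S(z)."""
    w = 1 / z
    return (2 - w ** 4) * (1 + w)


def ratio_is_even(Ha, Hb, points):
    """Test Ha(z)/Hb(z) == Ha(-z)/Hb(-z) in cross-multiplied form."""
    return all(Ha(z) * Hb(-z) == Ha(-z) * Hb(z) for z in points)


points = [Fraction(1), Fraction(2), Fraction(1, 3)]
example_is_even = ratio_is_even(H1, H2, points)        # False
shared_factor_is_even = ratio_is_even(even_a, even_b, points)  # True
```

At $z = 1$ the first pair already fails, since $H_1(1)H_2(-1) = 10$ while $H_1(-1)H_2(1) = 4$.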

On  the  other  hand  consider  the  example  below. 


Figure  18:  A  two  level  tree  structured  filter  bank. 


Example 5.1 Consider the set of analysis filters $H_0(z) = (z^6 - z^4 - z^2 - z + 1 + z^{-1})/(z^{12} - 2z^8 + 1)$, $H_1(z) = (z^6 + z^4 - z - z^{-1})/(z^{16} - z^8 - 2z^4 - 1)$, $H_2(z) = (-z^8 + z^6 + z^3 + z^2 - z - z^{-3})/(z^{16} - z^8 - 2z^4 - 1)$. The corresponding polyphase analysis matrix is then

I" _ z+l  z(z+l)  _ £ _ _ J-  1 

z4-2z-z2-l  z*—2z—z2—l  z4—2z—z2—l  z*-2z-z3-l 


E(z) 


z  z 

z4-2z-z2-l  z4—2z—.:-  —  l 

1  1 
-1+.;^-.;  zs  —2z2  +  l 

1  1 

^-23^+1 


1 

z4—2z—z2  —  l 


z 

—  1 +z'2  —  z 


1 

-i+z2  -f-i 


_ : 

z4-2z-z2-l 

z 

z4-2z2  +  l 

1 


z3-2z2  +  l 


The inverse $E^{-1}(z)$ exists and is given by

\[ E^{-1}(z) = \begin{bmatrix} 1 & z & z & z^2 \\ z - 1 & z(z - 1) & 1 & z \\ -z & z + 1 & z & z \\ -z(z - 1) & (z + 1)(z - 1) & 1 & 1 \end{bmatrix}. \]

The ratio $H_2(z)/H_1(z) = (-z^6 + z^4 + 1)/(z^4 + z^2) = B_1(z^2)/A_1(z^2)$ is an even rational function in $z$. Compare the analysis sides of fig. 6 and fig. 18. The set of filters on the analysis side of the tree structure in fig. 18 are then given by $A_0(z) = (z^6 - z^4 - z^2 - z + 1 + z^{-1})/(z^{12} - 2z^8 + 1)$, $B_0(z) = (z^2 - z^{-3})/(z^{16} - z^8 - 2z^4 - 1)$, $A_1(z) = z^2 + z$, and $B_1(z) = -z^3 + z^2 + 1$. From the expression of $E^{-1}(z)$, we obtain the synthesis filters $F_0(z) = z^6 + z^4 + z^3 + z$, $F_1(z) = z^{11} + z^9 + z^6 - z^5 + z^4 - z^3 + z^2$, $F_2(z) = -z^{11} + z^7 - z^6 + z^5 - z + 1$. The ratio $F_2(z)/F_1(z) = (-z^6 + 1)/(z^6 + z^4 + z^2) = D_1(z^2)/C_1(z^2)$. Comparing the synthesis sides of fig. 6 and fig. 18, the set of synthesis filters of the tree structure in fig. 18 are then given by $C_0(z) = z^6 + z^4 + z^3 + z$, $D_0(z) = z^5 - z + 1$, $C_1(z) = z^3 + z^2 + z$, $D_1(z) = -z^3 + 1$. Thus we have an equivalent tree structured filter bank to the given dyadic biorthogonal filter bank.

Figure 19: Illustration of Lemma 6.1. Panels (a) and (b).
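The factorizations claimed in Example 5.1 can be confirmed by direct polynomial multiplication. The sketch below represents a Laurent polynomial as a dictionary from exponent to coefficient and checks that $F_1(z) = D_0(z)C_1(z^2)$, $F_2(z) = D_0(z)D_1(z^2)$, and that the numerators of $H_1$ and $H_2$ equal the numerator of $B_0$ times $A_1(z^2)$ and $B_1(z^2)$. It takes $D_1(z) = -z^3 + 1$, the value implied by the stated ratio $F_2(z)/F_1(z) = (-z^6 + 1)/(z^6 + z^4 + z^2)$. The helper names are assumptions for this illustration.

```python
def pmul(p, q):
    """Product of two Laurent polynomials stored as {exponent: coefficient}."""
    out = {}
    for a, ca in p.items():
        for b, cb in q.items():
            out[a + b] = out.get(a + b, 0) + ca * cb
    return {e: c for e, c in out.items() if c != 0}


def subs_z2(p):
    """Replace z by z^2, that is, map p(z) to p(z^2)."""
    return {2 * e: c for e, c in p.items()}


# Synthesis side of Example 5.1.
D0 = {5: 1, 1: -1, 0: 1}                      # z^5 - z + 1
C1 = {3: 1, 2: 1, 1: 1}                       # z^3 + z^2 + z
D1 = {3: -1, 0: 1}                            # -z^3 + 1
F1 = {11: 1, 9: 1, 6: 1, 5: -1, 4: 1, 3: -1, 2: 1}
F2 = {11: -1, 7: 1, 6: -1, 5: 1, 1: -1, 0: 1}

# Analysis side: numerators only, since H1, H2 and B0 share one denominator.
B0_num = {2: 1, -3: -1}                       # z^2 - z^-3
A1 = {2: 1, 1: 1}                             # z^2 + z
B1 = {3: -1, 2: 1, 0: 1}                      # -z^3 + z^2 + 1
H1_num = {6: 1, 4: 1, 1: -1, -1: -1}
H2_num = {8: -1, 6: 1, 3: 1, 2: 1, 1: -1, -3: -1}

synthesis_ok = (pmul(D0, subs_z2(C1)) == F1 and pmul(D0, subs_z2(D1)) == F2)
analysis_ok = (pmul(B0_num, subs_z2(A1)) == H1_num
               and pmul(B0_num, subs_z2(B1)) == H2_num)
```

Both flags evaluate to true, which is consistent with the second-level filters depending on $z^2$ only.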


6  Stable  and  FIR  decompositions 


We  now  turn  to  the  following  questions.  Suppose  an  analysis  or  synthesis  bank  comprises  stable 
(respectively,  FIR)  filters,  and  is  decomposable  to  a  tree  structure.  Then  can  the  tree  structure  be 
chosen  so  as  to  comprise  exclusively  of  stable  (respectively,  FIR)  filters?  Lemma  6.1  below  provides  an 
affirmative  answer. 

Lemma 6.1 Suppose the structure in figure 19(a) is equivalent to that in figure 19(b), with all filter transfer functions rational. $F(z)$ is scalar, $H(z) = \begin{bmatrix} H_1(z), & H_2(z), & \ldots, & H_M(z) \end{bmatrix}^T$ and $G(z) = \begin{bmatrix} G_1(z), & G_2(z), & \ldots, & G_M(z) \end{bmatrix}^T$ are $M \times 1$. Suppose the filters in figure 19(a) are stable (respectively, FIR). Then all filters in figure 19(b) can also be selected to be stable (respectively, FIR).


Proof: We will prove the result for the stable case. The FIR case is similar. The equivalence ensures that

\[ H(z) = F(z)G(z^K). \]

Suppose $F(z)$ is unstable. Write it as

\[ F(z) = \frac{\tilde{F}(z)}{z - \beta} \]

with $\beta$ an unstable pole. Thus as $H(z)$ is stable, $z - \beta$ must be a common factor of each $G_i(z^K)$. Since these are rational in $z^K$,

\[ G(z^K) = \tilde{G}(z^K)(z^K - \beta^K). \]

Consequently, with $\beta(z)$ the polynomial $(z^K - \beta^K)/(z - \beta)$,

\[ H(z) = [\tilde{F}(z)\beta(z)] [\tilde{G}(z^K)]. \]

Thus by replacing $F(z)$ by $\tilde{F}(z)\beta(z)$ and $G(z)$ by $\tilde{G}(z)$ in fig. 19, one retains the equivalence while removing the unstable pole $\beta$ from $F(z)$, and without adding new poles to $G(z)$. Continuing this procedure it follows that $F(z)$ can be chosen to be stable.

Now suppose $G(z)$ is unstable, and at least one $G_i(z)$ has the unstable pole $\eta$. Then one can write, in a possibly nonminimal way:

\[ G(z^K) = \frac{\hat{G}(z^K)}{z^K - \eta} \]

where some elements of $\hat{G}(z^K)$ may have the zero $\eta$. It follows that, as $F(z)$ has no unstable poles, for some rational stable $\hat{F}(z)$,

\[ F(z) = \hat{F}(z)(z^K - \eta). \]

Thus, one can write

\[ H(z) = [\hat{F}(z)] [\hat{G}(z^K)]. \]

Thus by replacing $F(z)$ by $\hat{F}(z)$ and $G(z)$ by $\hat{G}(z)$ in fig. 19, one retains the equivalence while removing the unstable pole $\eta$ from $G(z)$, and without adding new poles to $F(z)$. Continuing this procedure it follows that $G(z)$ can be chosen to be stable. ■
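The first step of the proof relies on $(z^K - \beta^K)/(z - \beta)$ being a polynomial, namely $\sum_{i=0}^{K-1} \beta^{K-1-i} z^i$, so that moving it into $F(z)$ cancels the pole at $\beta$ without creating new ones. The sketch below verifies the identity $(z - \beta)\beta(z) = z^K - \beta^K$ with exact arithmetic for one hypothetical choice, $\beta = 3/2$ and $K = 4$; the function names are illustrative.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def beta_poly(beta, K):
    """Coefficients of (z^K - beta^K)/(z - beta) = sum_i beta^(K-1-i) z^i."""
    return [beta ** (K - 1 - i) for i in range(K)]


beta = Fraction(3, 2)   # hypothetical unstable pole (magnitude above one)
K = 4                   # hypothetical decimation factor

lhs = poly_mul([-beta, 1], beta_poly(beta, K))   # (z - beta) * beta(z)
rhs = [-beta ** K] + [0] * (K - 1) + [1]         # z^K - beta^K
```

The two coefficient lists agree, which is the cancellation used when $F(z)$ is replaced by $\tilde{F}(z)\beta(z)$.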


A similar result applying to synthesis structures can also be used. Using this Lemma one can prove the following strengthened version of Theorems 4.1 and 5.1. Note first that an analysis or synthesis bank has a stable inverse if its extended polyphase matrix is minimum phase, that is, its determinant is minimum phase. Similarly an FIR analysis or synthesis bank is FIR invertible if its extended polyphase matrix is unimodular, that is, has constant determinant.

Theorem 6.1 Suppose a stable (respectively, FIR) nonuniform analysis bank/synthesis bank with rational filters is decomposable into a tree structure satisfying Assumption 4.1. Then this TSAB/TSSB can be chosen to have all elements stable (respectively, FIR). Further suppose the extended polyphase representation of the TSAB/TSSB is invertible and minimum phase (respectively, unimodular). Then the inverse is a stable (respectively, FIR) TSSB/TSAB.
Proof: The first part of the theorem follows from repeated application of Lemma 6.1. To prove the second part invoke Theorem 4.1. Suppose the analysis bank is invertible and minimum phase (respectively, unimodular). Then it has a stable (respectively, FIR) inverse, which by Theorem 4.1 can be decomposed into a tree structure prescribed by the theorem. By the first part of this theorem, that tree structure can be chosen to be stable (respectively, FIR). The result with respect to stably (respectively, FIR) invertible TSSB follows similarly. ■


7  Conclusions 

Two principal results have been presented relating biorthogonal nonuniform filter banks with tree structures: (i) that a TSAB is invertible iff its inverse can be decomposed into a matching TSSB, with each matching level of the resulting tree structured filter bank being itself a biorthogonal uniform filter bank; (ii) that a dyadic analysis bank is conformally invertible iff it can be decomposed into a TSAB with the 2-channel uniform filter banks on each level themselves invertible. The second result thus also provides an easy test for conformal invertibility of dyadic filter banks. All these decompositions preserve stability. If a stable (respectively, FIR) analysis bank or synthesis bank is decomposable into a TSAB or TSSB, then this equivalent tree structure can be chosen to have all constituent filters stable (respectively, FIR).

These results were derived under the assumption that all filters have rational transfer functions. Barring the results of Section 6, this assumption is in fact unnecessary, and has been invoked primarily to use the Noble identities. Thus, the notion of extended polyphase representation and the results of Sections 3 and 4 can be derived independently of the rationality assumption. The two key devices used in Section 5 are (5.19) and Lemma 5.1. The latter does not assume rationality, and the former can be proved independently of rationality by using techniques similar to the proof of Lemma 5.1.

References 

[1] A. Akansu and R. Haddad, Multiresolution Signal Decomposition - Transforms, Subbands, and Wavelets, Academic Press, Inc., 1992.

[2] S. Akkarakaran and P.P. Vaidyanathan, “New Results and Open Problems on Nonuniform Filterbanks”, in Proceedings of ICASSP, pp 1501-1504, Mar. 15-19, 1999.

[3] M. Antonini, M. Barlaud, P. Mathieu and I. Daubechies, “Image Coding Using Wavelet Transform”, IEEE Transactions on Image Processing, pp 205-220, Apr. 1992.

[4] A. Cohen, I. Daubechies and J.-C. Feauveau, “Biorthonormal Bases of Compactly Supported Wavelets”, Comm. on Pure and App. Math., pp 485-560, Jun. 1992.

[5] F. M. de Saint-Martin, P. Siohan and A. Cohen, “Application of Multiwavelet Filterbanks to Image Processing”, IEEE Transactions on Image Processing, pp 205-220, Feb. 1999.

[6] I. Djokovic and P.P. Vaidyanathan, “Results on biorthogonal filter banks”, Applied and Computational Harmonic Analysis, vol. 1, pp 329-343, 1994.

[7] P.Q. Hoang and P.P. Vaidyanathan, “Non-Uniform Multirate Filter Banks: Theory and Design”, Proceedings of ISCAS, pp 371-374, 1989.

[8] A. Kirac and P.P. Vaidyanathan, “On existence of FIR principal component filter banks”, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, vol. 3, pp 1329-1332, May 12-15, 1998.

[9] G. F. Ribeiro and G. V. Mendonca, “Image Coding using Biorthonormal Wavelet Transform”, Midwest Symposium on Circuits and Systems 2, Aug. 13-16, 1995.

[10] B.G. Sherlock and D.M. Monro, “Optimized Wavelets for Fingerprint Compression”, ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, pp 1447-1450, May 7-10, 1996.

[11] A. K. Soman and P.P. Vaidyanathan, “On Orthonormal Wavelets and Paraunitary Filter Banks”, IEEE Transactions on Signal Processing, pp 1170-1183, Mar. 1993.

[12] G. Strang and T. Nguyen, Wavelets and Filter Banks, Wellesley-Cambridge Press, 1996.

[13] V. Strela, P. N. Heller, G. Strang, P. Topiwala and C. Heil, “Biorthonormal Filterbanks and Energy Preservation Property in Image Compression”, IEEE Transactions on Image Processing, pp 548-563, Apr. 1999.

[14] M.K. Tsatsanis and G.B. Giannakis, “Principal component filter banks for optimal multiresolution analysis”, IEEE Transactions on Signal Processing, pp 1766-1776, Aug. 1995.

[15] P.P. Vaidyanathan, Multirate Systems and Filter Banks, Prentice Hall, 1992.

[16] P.P. Vaidyanathan, “Orthonormal and Biorthonormal Filter Banks as Convolvers, and Convolutional Coding Gain”, IEEE Transactions on Signal Processing, pp 2110-2130, Jun. 1993.

[17] P.P. Vaidyanathan, “Theory of optimal orthonormal subband coders”, IEEE Transactions on Signal Processing, pp 1528-1543, Jun. 1998.

[18] P.P. Vaidyanathan and A. Kirac, “Results on Optimal Biorthonormal Filter Banks”, IEEE Transactions on Signal Processing, pp 932-947, Aug. 1998.

[19] M. Vetterli and C. Herley, “Wavelets and Filter Banks: Theory and Design”, IEEE Transactions on Signal Processing, pp 2207-2232, Sep. 1992.

[20] M.H. Yaou and W.T. Chang, “M-ary Wavelet Transform and Formulation for Perfect Reconstruction in M-band Filter Bank”, IEEE Transactions on Signal Processing, pp 3508-3512, Dec. 1994.

[21] S. Akkarakaran, Filter Bank Optimization with Applications in Noise Suppression and Communications, Ph.D. dissertation, Chapter 6, California Institute of Technology, 2001.




A  Novel  Channel  Identification  Method  for 
Wireless  Communication  Systems 


Honghui Xu¹, Soura Dasgupta¹, and Zhi Ding²
¹ Electrical & Computer Engineering Department
The University of Iowa
Iowa City, IA-52242, USA.

² Department of Electrical & Computer Engineering
University  of  California 
Davis,  CA  95616,  USA. 

Emails:  hoxu,  dasgupta@engineering.uiowa.edu,  zding@ece.ucdavis.edu 
Keywords:  Identification,  Equalization,  Feedback,  Wireless,  Training 


¹ Work supported in part by NSF grants ECS-9970105 and CCR-9973133.
² Work supported in part by NSF grant CCR-9996206.


August 1, 2002


DRAFT 


Abstract 


We  present  a  novel  dual  channel  identification  approach  for  mobile  wireless  communication  systems. 
Unlike  traditional  channel  estimation  methods  that  rely  on  training  symbols,  we  propose  a  bent-pipe  feedback 
mechanism  which  requires  the  mobile  station  (MS)  to  send  portions  of  its  received  signal  back  to  the  Base 
Station  (BS)  for  wireless  channel  identification.  Using  a  filter-bank  decomposition  concept,  we  introduce  an 
effective  algorithm  that  can  identify  both  the  forward  and  the  reverse  channels  based  only  on  this  feedback 
information.  This  new  method  permits  transfer  of  computational  burden  from  the  MS  to  the  resource  rich 
BS  and  leads  to  significant  savings  in  bandwidth  consuming  training  signals. 

I.  Introduction 

We  propose  a  new  approach  to  the  estimation  and  compensation  of  forward  link  channels  in 
mobile  wireless  communication  systems  that  centers  on  a  novel  bent  pipe  feedback  mechanism. 
In principle, this feedback mechanism enables Base Stations (BS) to simultaneously estimate
both the Forward Link Channel (FLC) from the BS to a Mobile Station (MS) and the Reverse Link
Channel (RLC) from the MS to the BS, without any training signals or resorting to blind
estimation techniques. While practical realities temper these theoretical expectations, as we
will  demonstrate  in  this  paper,  our  techniques  bring  with  them  certain  significant  advantages. 

Our  paper  is  motivated  by  the  strong  surge  in  mobile  wireless  communication  systems. 
The rapidly expanding arena of wireless services including mobile computing and broadband
multimedia would require much higher wireless capacity and higher data rates than
is currently needed, over the often unreliable wireless medium. Indeed, unlike their wireline
counterparts, wireless communication links are highly susceptible to channel variations,
particularly  in  mobile  environments.  Three  specific  advantages  motivate  our  approach. 

(A) Adaptive Coding

The  high  variability  of  wireless  links  poses  a  serious  challenge  to  assuring  quality  of  service 
(QoS)  to  various  traffic  flows  under  harsh  and  dynamically  distortive  channel  conditions. 
To  address  QoS  needs,  future  wireless  communication  systems  must  respond  to  potentially 
rapid  channel  changes  in  an  effective  and  timely  manner.  Currently,  FLC  estimation  and 
equalization responsibilities are assigned almost entirely to the MS [1], [2], [3]. At the same
time, to improve adaptivity to channel conditions, several researchers have proposed a number
of schemes that require at least a partial awareness of the FLC at the BS. Thus, Paulraj
et al. [6] demonstrate substantially improved performance through the use of adaptive
space time coding that assumes that the BS has partial knowledge of the FLC. Goeckel et
al. [10] propose other forms of adaptive coding, again relying on similar information. Bit
loading in OFDM systems similarly assumes a channel aware BS [8], as indeed does the
power minimization scheme of [9]. Similarly, precoding techniques [11] can be significantly
improved should the BS have partial knowledge of the FLC. Our bent pipe feedback technique
naturally apprises the BS of the prevailing FLC conditions that are needed by these
schemes.

(B)  Shared  FLC  compensation 

As  noted  earlier,  the  tasks  of  FLC  compensation  and  estimation  are  often  assigned  to  the  MS. 
It  is  recognized  that  most  future  wireless  cellular  services  will  be  characterized  by  an  FLC 
that supports higher data rates than does the RLC. For a given channel, higher data rate
induces  longer  delay  spread  and  more  severe  ISI.  In  other  words  the  discrete  time  baseband 
model  is  of  a  higher  order,  and  its  equalization  and  compensation  more  onerous.  At  the 
same  time  it  is  the  BS  that  houses  greater  computational  reserves,  even  though  it  is  assigned 
the  less  burdensome  task  of  estimating  and  compensating  the  RLC.  It  is  thus  desirable  to 
shift  at  least  a  part  of  the  FLC  compensation  and  estimation  burden  to  the  more  resource 
rich  BS.  The  feedback  mechanism  we  propose  permits  a  better  utilization  of  this  resource 
disparity,  by  transferring  much  of  the  FLC  estimation  and  compensation  tasks  to  network 
nodes  with  greater  resources. 

More  specifically,  our  method  in  principle  permits  the  BS  to  estimate  and  pre-compensate 
the  wireless  channel.  In  practice,  because  of  roundtrip  delays  and  channel  variations  that 
occur  within  the  resolution  of  such  delays,  the  BS  will  only  partially  compensate  the  dynamic 
channel,  and  the  MS  must  take  part  in  combatting  the  residual  ISI.  Nonetheless,  this  partially 
compensated  channel  will  induce  reduced  levels  of  ISI,  whose  removal  would  consequently 
impose  far  less  computational  burden  on  the  MS. 

(C)  Reduced  Training 

Channel  estimation  at  the  MS  is  typically  assisted  by  the  frequent  transmission  of  training 
data.  As  the  FLC  supports  higher  data  rates,  and  is  consequently  described  by  a  higher  order 
discrete  time  model,  it  requires  longer  training  sequences,  transmitted  more  frequently.  This 
is obviously at the expense of the all too precious forward link bandwidth. In GSM, for
example, roughly one sixth of the transmission time is devoted to training signals.

As noted in (B), due to roundtrip delays our scheme permits the BS to only partially
compensate the FLC, and the MS must combat the residual ISI with the assistance of some
training. Nonetheless, this partially compensated channel suffers from reduced levels of ISI.
As we demonstrate through simulations in a later section, the number of training symbols
needed for the estimation and compensation of the partially compensated FLC is significantly
lower, with consequent savings in the bandwidth allocated to FLC training. This advantage
is buttressed by the fact that no training is needed on the RLC, and that instead the time
slot normally used for RLC training can be used for the feedback data. ■

It  should  be  noted  that  2G  CDMA  systems  and  emerging  3G  systems  do  employ  some 
simple feedback. Such feedback, however, is often limited to an estimated power loss parameter
that enables the BS to compensate multipath distortion via power-control. Although
power-control or higher SNR can improve MS performance, particularly against flat fading
channels, channel distortion as a result of multipath fading cannot be efficiently compensated
by mere increase of transmission power. Other proposals for channel information feedback
are in [4]-[6], which involve feeding back channel parameter estimates to the BS at appropriate
instants of time. These proposals continue the practice of assigning the sole responsibility
of channel estimation to the resource challenged MS, and do not reduce the training burden
on the FLC.

Unlike  the  conventional  feedback  of  channel  estimates  in  [4],  [5],  our  new  approach  only 
requires  that  the  MS  feed  back  to  the  BS  a  portion  of  the  received  signal,  over  the  time  slot 
conventionally  reserved  for  RLC  training,  in  epochs  where  either  it  normally  transmits  or  in 
epochs  where  it  detects  a  performance  degradation.  Clearly,  this  permits  the  BS  to  estimate 
the  Roundtrip  Channel  (RTC). 

However,  the  key  novelty  of  our  approach  lies  in  the  following  discovery:  By  feeding  back 
only a portion, rather than the entire received signal, one empowers the BS to identify both
the FLC and the RLC from the roundtrip feedback signal alone. This novel channel feedback
does not require high speed reverse links and can naturally accommodate asymmetric data
link structures. Furthermore, no additional training signals are necessary for estimating the
RLC at the BS. As will be explained further, this scheme requires no greater overhead than
those associated with [4]-[6], while at the same time having the fundamental advantages of
shifting substantial processing burden from the MS to the BS and of reducing the level of
training data.


[Figure: block diagram with three stages labeled BS Transceiver, Wireless propagation, and MS Transceiver.]

Fig. 1. A wireless communication system with signal feedback.


Our paper is structured as follows. First, the basic principle of roundtrip feedback is presented
in Section II. Its implementation issues are addressed and practical caveats analyzed.
Conditions under which RLC and FLC can be obtained from the RTC are in Section III.
An algorithm to estimate the roundtrip dynamics is formulated in Section IV. Section V
provides the unraveling algorithm. Simulations are in Section VI.

II.  Bentpipe  Feedback 

The  feedback  scheme  we  use  is  depicted  in  Fig.  1.  Specifically  in  this  figure  the  FLC  and 
RLC, respectively, operate at the rates $1/T_1$ and $1/T_2$ with $T_1 < T_2$ and $T_1/T_2 = L/M$
for integers L and M that need not be coprime. The sampled data at the FLC output is
converted to the RLC rate by the decimator/interpolator combination depicted in the figure.
This is interlaced, in the stead of normal RLC training data, with the data that the MS needs
to transmit through the RLC in its normal course of operation, and fed back to the BS.


Fig.  2.  The  equivalent  digital  expression  of  the  system  (noise  and  interference  considered) 

Fig. 1 can be transformed into Fig. 2, where x(n) is the digital data sequence transmitted
by the BS at time $nT_1$ at the rate $1/T_1$, y(n) is the data sequence received by the BS at time $nT_2$
after sampling at the rate $1/T_2$, and h(n) and g(n) are the FIR impulse responses of the
FLC and RLC respectively. Further, $w_1(n)$ and $w_2(n)$ are the noise sequences at the FLC
and RLC outputs, and u(n) models the interference caused by the normal RLC data due to
imperfect synchronization. Throughout we make the following standing assumption.

Assumption 1: The signals x(n), u(n), $w_1(n)$, $w_2(n)$ are zero mean, white and mutually
uncorrelated.

It  is  clear  that  given  that  the  BS  is  aware  of  the  data  it  has  transmitted,  under  assumption 
1  it  can  estimate  the  RTC.  Using  ideas  from  [12],  we  show  that  under  mild  assumptions 
on  the  FLC  and  RLC,  the  BS  can  in  fact  directly  unravel  the  FLC  and  RLC  from  the 
RTC, without any training in either FLC or RLC. This ability to separate RLC and FLC
from the RTC estimate can be seen as a consequence of the rate changing mechanism that
separates the two channels. Rate changers are time varying systems, and consequently, even
when $w_1(n) = w_2(n) = u(n) = 0$, the LTI operators h(n) and g(n) cannot be arbitrarily
interchanged. 
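The role of the rate changers can be illustrated with a small numerical sketch. It assumes the noise-free chain implied by the polyphase relations of Section III (filter by h, decimate by M, expand by L, filter by g); the tap values and the helper name `roundtrip` are hypothetical and chosen only for illustration.

```python
import numpy as np

def roundtrip(x, h, g, M, L):
    """Noise-free chain assumed here: filter by h, keep every M-th sample,
    insert L-1 zeros between samples, then filter by g."""
    v = np.convolve(x, h)[::M]
    up = np.zeros(len(v) * L)
    up[::L] = v
    return np.convolve(up, g)

rng = np.random.default_rng(0)
x = rng.standard_normal(60)
h = np.array([1.0, 0.5, -0.3, 0.2])   # hypothetical FLC taps
g = np.array([0.8, -0.4, 0.1])        # hypothetical RLC taps

# With rate changers between the two filters, swapping h and g changes the output.
y_hg = roundtrip(x, h, g, M=3, L=2)
y_gh = roundtrip(x, g, h, M=3, L=2)
n = min(len(y_hg), len(y_gh))
swap_changes_output = not np.allclose(y_hg[:n], y_gh[:n])

# Without rate changing (M = L = 1) the cascade is LTI and the filters commute.
same_when_no_rate_change = np.allclose(
    roundtrip(x, h, g, 1, 1), roundtrip(x, g, h, 1, 1)
)
print(swap_changes_output, same_when_no_rate_change)
```

In the purely LTI case only the product H(z)G(z) is visible from the input and output, which is why the two channels could not be told apart without the rate change.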

We  now  pose  and  answer  two  questions  surrounding  the  practicality  of  this  approach.  In 
doing  so  at  appropriate  places  we  discuss  the  data  and  computational  overheads  associated 
with this scheme relative to those of the feedback schemes mentioned in [4]-[6].

(i)  What  about  roundtrip  delay?  Over  reasonable  distances  the  roundtrip  delay  is  in 
tens of microseconds. Thus, for example, over a roundtrip distance of 5 km, this delay is
about 16.67 μs. Over such time spans, the environment as seen by the mobile unit undergoes
little change. This is underscored by the fact that in GSM each data frame has a duration
of 557 μs, and training occurs only once per data frame. Thus the channel variation within
the resolution of this delay occurs mainly because of Doppler effect. Still, even with Doppler
effect on high speed mobiles, the channel characteristics are unlikely to change drastically
within such a short time interval. As a case in point, a vehicle traveling at 100 km/hr suffers
a maximum Doppler shift of 50 Hz at cellular band. The channel variation thus endured will
not be enough to prevent the transmitter from substantially compensating the FLC. Thus
the residual ISI that must be equalized at the receiver will be significantly milder, leading to
the need for much shorter training sequences on the FLC. Given that no training is needed
on the RLC, and that feedback data occupies the RLC training slot used in conventional
communication,  this  implies  substantial  savings  in  the  bandwidth  devoted  to  the  overall 
training.  Simulations  presented  later  support  this  contention.  Since  the  BS  has  greater 
computational  reserves,  this  would  also  effect  a  beneficial  transfer  of  computational  burden 
to  such  a  resource  rich  BS. 
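The figures quoted above can be reproduced with a back-of-the-envelope sketch. The propagation speed is rounded to 3e8 m/s, which is what the 16.67 μs figure implies. The phase-drift measure (Doppler shift times delay) is an illustrative assumption of ours and not a quantity used in the paper.

```python
import math

C = 3.0e8  # propagation speed in m/s, rounded as the quoted delay implies

def roundtrip_delay_s(roundtrip_distance_m):
    """Propagation delay over the full roundtrip distance."""
    return roundtrip_distance_m / C

def phase_drift_deg(doppler_hz, delay_s):
    """Carrier phase rotation accumulated over delay_s at the given Doppler shift."""
    return math.degrees(2.0 * math.pi * doppler_hz * delay_s)

delay_us = roundtrip_delay_s(5_000.0) * 1e6   # 5 km roundtrip
drift_deg = phase_drift_deg(50.0, delay_us * 1e-6)
print(f"delay = {delay_us:.2f} us, drift = {drift_deg:.2f} deg")
```

Under these assumptions a 50 Hz Doppler shift rotates the channel phase by well under one degree during the roundtrip, consistent with the claim that the channel changes little over that interval.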

We should also note that the feedback schemes suggested in [4]-[6] would suffer the same
latency effects manifested by the roundtrip delay. Thus, while the additional advantages
of our scheme are conspicuously absent from the approaches of [4]-[6], the effectiveness
of adaptive coding noted in the introduction should be comparable in both instances.

(ii)  What  about  battery  life?  Overly  frequent  feedback  transmissions  may  deplete  power 
resources at the receiver. In epochs where the receiver also transmits, the training data
currently sent can be substituted by the feedback data, as RLC training is no longer needed.
During silent uplink episodes, feedback can be restricted to epochs where the receiver detects
a high level of packet loss due to obsolete channel estimates, in much the same way as
existing proposals for channel estimate feedback require. Thus, the associated overhead is
again comparable to that in [4]-[6].

Taking  a  more  long  range  view,  whereas  battery  technology  continuously  improves,  with 
ever  lengthening  battery  life,  bandwidth  resources  will  remain  scarce.  Given  the  bandwidth 
savings reduced training brings about, one can expect bent pipe feedback to gain an increasing
advantage in this trade-off. ■

III.  Conditions  for  estimating  RLC/FLC  from  RTC 

Borrowing  ideas  from  [12]  we  show  that  it  is  possible  in  principle  to  unravel  the  FLC  and 
RLC  dynamics  from  the  RTC.  We  consider  the  case  without  noise  and  user  interference  at 
first.  The  case  with  noise  and  interference  will  be  treated  later. 

Consider  the  polyphase  representation  of  the  FLC  and  RLC  transfer  functions  H(z)  and 
G(z)  [13],  given  below. 


\[ H(z) = \sum_{i=0}^{M-1} H_i(z^M) z^{-i} \tag{1} \]

\[ G(z) = \sum_{i=0}^{L-1} G_i(z^L) z^{-(L-1-i)}. \tag{2} \]

In the sequel $H_i(z)$ and $G_i(z)$ will be called the polyphase components of H(z) and G(z),
respectively. Then with

\[ x_i(n) = x(nM - i) \quad \text{and} \quad y_i(n) = y(nL + L - 1 - i), \tag{3} \]

we  can  redraw  Fig.  2  as  Fig.  3. 
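The polyphase relations (1)-(3) can be checked numerically. The sketch below assumes the noise-free chain (filter by h, decimate by M, expand by L, filter by g) and compares its output with the polyphase form, in which each $y_i$ is the sum over j of $g_i * h_j * x_j$. Tap values and function names are hypothetical.

```python
import numpy as np

def add_padded(arrays):
    """Sum 1-D arrays of different lengths, zero-padding the shorter ones."""
    out = np.zeros(max(len(a) for a in arrays))
    for a in arrays:
        out[:len(a)] += a
    return out

def direct(x, h, g, M, L):
    """Filter by h, decimate by M, expand by L, filter by g."""
    v = np.convolve(x, h)[::M]
    up = np.zeros(len(v) * L)
    up[::L] = v
    return np.convolve(up, g)

def polyphase(x, h, g, M, L):
    """Return [y_0, ..., y_{L-1}] computed from polyphase components."""
    # x_j(n) = x(nM - j) with x(n) = 0 for n < 0;  h_j(n) = h(Mn + j)
    v = add_padded([
        np.convolve(np.concatenate([np.zeros(j), x])[::M], h[j::M])
        for j in range(M)
    ])
    # g_i(n) = g(Ln + L - 1 - i)
    return [np.convolve(v, g[L - 1 - i::L]) for i in range(L)]

M, L = 3, 2
rng = np.random.default_rng(0)
x = rng.standard_normal(30)
h = np.array([1.0, 0.5, -0.3, 0.2, 0.7])   # hypothetical FLC taps
g = np.array([0.8, -0.4, 0.1, 0.3])        # hypothetical RLC taps

y = direct(x, h, g, M, L)
y_poly = polyphase(x, h, g, M, L)

max_err = 0.0
for i in range(L):
    y_i = y[L - 1 - i::L]                  # y_i(n) = y(nL + L - 1 - i)
    n = min(len(y_i), len(y_poly[i]))
    max_err = max(max_err, float(np.max(np.abs(y_i[:n] - y_poly[i][:n]))))
print(max_err)
```

The agreement to rounding error reflects the fact that the map from the $x_j$ to the $y_i$ is the LTI system with entries $G_i(z)H_j(z)$.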

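Thus, in principle, knowledge of the $x_i(n)$ and $y_i(n)$ permits estimation of the rank one matrix
transfer function

\[ F(z) = \begin{bmatrix} G_0(z) \\ G_1(z) \\ \vdots \\ G_{L-1}(z) \end{bmatrix} \begin{bmatrix} H_0(z) & H_1(z) & \cdots & H_{M-1}(z) \end{bmatrix}. \tag{4} \]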


Fig.  3.  The  polyphase  representation  of  the  system. 


Now  we  will  need  one  of  the  following  two  assumptions  (both  need  not  be  satisfied). 

Assumption 2: The greatest common divisor (gcd) of the set of polynomials $H_i(z)$ is a
pure delay $z^{-d}$ (d integer). Further their maximum order is known.

Assumption 3: The gcd of the set of polynomials $G_i(z)$ is a pure delay $z^{-d}$ (d integer).
Further their maximum order is known.

Assumption 2 for example is quite common in fractionally spaced blind equalization, [15],
[16]. Further, pure delay common factors in $H_i(z)$ and $G_i(z)$ result from delays in H(z) and
G(z), respectively, though the converse need not be true. In particular only delays of M
(respectively L) or larger result in the $H_i(z)$ (respectively $G_i(z)$) having a pure delay as a
common factor. Then we have the following Theorem.


Theorem 1: Suppose one or both of the two assumptions 2 or 3 hold. Then the matrix in
(4) gives H(z) and G(z) to within a common nonzero scaling factor and a delay as long as
$H(z)G(z) \neq 0$.

Proof: Suppose assumption 2 holds. Observe that one has available $G_0(z)H_i(z)$,
$i \in \{0, \cdots, M-1\}$. Then the gcd of $G_0(z)H_i(z)$, $i \in \{0, \cdots, M-1\}$, is $a z^{-d} G_0(z)$ for some
scalar a and integer d. Thus by dividing by this gcd one obtains the $H_i(z)$ to within a scalar
and delay, and hence also the $G_i(z)$. The result under assumption 3 can be similarly deduced.

■

Should, however, the $H_i(z)$ and $G_i(z)$ have respective common factors $1 - \alpha z^{-1}$ and $1 - \beta z^{-1}$,
$\alpha \neq \beta$, then these can be exchanged between the $G_i(z)$ and $H_i(z)$ in (4) without changing
the input output relationship exemplified by (4). Consequently the $G_i(z)$ and $H_i(z)$ cannot
be separately extracted. Theorem 1, of course, does not show how H(z) and G(z) can be determined
to within a scaling factor. The sequel provides such algorithms. Note that [12] does not have
these algorithms.
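The ambiguity described above is easy to exhibit numerically. In the sketch below, all polynomial coefficients, the values of alpha and beta, and the helper names are hypothetical. Exchanging the two common factors between the $G_i(z)$ and the $H_j(z)$ leaves every product $G_i(z)H_j(z)$ unchanged, while the individual components differ.

```python
import numpy as np

alpha, beta = 0.4, -0.6   # hypothetical common-factor roots, alpha != beta

# Hypothetical coprime "core" polynomials in z^{-1} (coefficient lists).
H_core = [np.array([1.0, 0.5]), np.array([1.0, -0.3])]                          # M = 2
G_core = [np.array([1.0, 0.2]), np.array([1.0, 0.7]), np.array([0.5, 1.0])]     # L = 3

def attach(polys, root):
    """Multiply every polynomial by the common factor (1 - root * z^{-1})."""
    return [np.convolve(p, [1.0, -root]) for p in polys]

def rank_one(G, H):
    """Coefficients of all products G_i(z) H_j(z), i.e. the entries of F(z)."""
    return [[np.convolve(gi, hj) for hj in H] for gi in G]

# Original assignment: H_j carry (1 - alpha z^{-1}), G_i carry (1 - beta z^{-1}).
H, G = attach(H_core, alpha), attach(G_core, beta)
# Exchanged assignment.
H_sw, G_sw = attach(H_core, beta), attach(G_core, alpha)

F, F_sw = rank_one(G, H), rank_one(G_sw, H_sw)
same_F = all(
    np.allclose(F[i][j], F_sw[i][j])
    for i in range(len(G)) for j in range(len(H))
)
same_H = np.allclose(H[0], H_sw[0])
print(same_F, same_H)
```

Since both factorizations produce the same F(z), no algorithm working from F(z) alone could decide between them, which is why one of assumptions 2 or 3 is needed.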

IV.  Estimating  the  roundtrip  dynamics 

In  this  section,  we  describe  how  the  roundtrip  dynamics  can  be  estimated  at  the  BS  using 
the  feedback  information.  To  this  end,  in  Section  IV-A  we  provide  a  z-domain  description 
of  the  relation  between  x(n)  and  y(n).  Section  IV-B  explains  how  this  description  leads  to 
an  estimation  algorithm. 

A.  An  input  output  relationship 

Consider now Fig. 2 with u(k) the interference caused by the normal RLC data due to
imprecise synchronization. Adopt the standard notation of representing the z-transform of
signals represented by small letters such as a(n), by capital letters, e.g. A(z).

Define

\[ w_{2k}(n) \triangleq w_2(nL + L - k - 1), \quad 0 \le k \le L-1, \quad \text{and} \quad u_k(n) \triangleq u(nL - k), \quad 0 \le k \le L-1. \]

Then, [13],

\[ Y(z) = \sum_{k=0}^{L-1} z^{-(L-1-k)} Y_k(z^L) \quad \text{and} \quad X(z) = \sum_{k=0}^{M-1} z^{-k} X_k(z^M). \]

Likewise,

\[ W_2(z) = \sum_{k=0}^{L-1} z^{-(L-1-k)} W_{2k}(z^L) \quad \text{and} \quad U(z) = \sum_{k=0}^{L-1} z^{-k} U_k(z^L). \]


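Thus we can rewrite

\[ \sum_{k=0}^{L-1} z^{-(L-1-k)} Y_k(z^L) = \sum_{k=0}^{L-1} z^{-(L-1-k)} G_k(z^L) V(z^L) + \sum_{k=0}^{L-1} z^{-(L-1-k)} W_{2k}(z^L) + \sum_{k=0}^{L-1} \sum_{i=0}^{L-1} z^{-(L-1-k+i)} G_k(z^L) U_i(z^L). \tag{5} \]

Equating coefficients on both sides of (5) one obtains:

\[ \begin{bmatrix} Y_0(z) \\ \vdots \\ Y_{L-1}(z) \end{bmatrix} = \begin{bmatrix} G_0(z) \\ \vdots \\ G_{L-1}(z) \end{bmatrix} V(z) + \mathbf{G}(z) \begin{bmatrix} U_0(z) \\ \vdots \\ U_{L-1}(z) \end{bmatrix} + \begin{bmatrix} W_{20}(z) \\ \vdots \\ W_{2(L-1)}(z) \end{bmatrix}, \tag{6} \]

where $\mathbf{G}(z)$ is the $L \times L$ matrix

\[ \mathbf{G}(z) = \begin{bmatrix} G_0(z) & G_1(z) & \cdots & G_{L-1}(z) \\ G_1(z) & G_2(z) & \cdots & z^{-1}G_0(z) \\ \vdots & \vdots & & \vdots \\ G_{L-1}(z) & z^{-1}G_0(z) & \cdots & z^{-1}G_{L-2}(z) \end{bmatrix}. \tag{7} \]

Observe,

\[ V(z) = \sum_{j=0}^{M-1} X_j(z) H_j(z) + W_{10}(z), \tag{8} \]

where

\[ w_{10}(n) = w_1(Mn). \tag{9} \]

Thus one has

\[ \begin{bmatrix} Y_0(z) \\ \vdots \\ Y_{L-1}(z) \end{bmatrix} = F(z) \begin{bmatrix} X_0(z) \\ \vdots \\ X_{M-1}(z) \end{bmatrix} + \begin{bmatrix} G_0(z) \\ \vdots \\ G_{L-1}(z) \end{bmatrix} W_{10}(z) + \mathbf{G}(z) \begin{bmatrix} U_0(z) \\ \vdots \\ U_{L-1}(z) \end{bmatrix} + \begin{bmatrix} W_{20}(z) \\ \vdots \\ W_{2(L-1)}(z) \end{bmatrix}, \tag{10} \]

where F(z) is given by (4).

B. Estimation of RTC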


Roundtrip dynamics estimation involves the estimation of F(z) from (10). The key thing
to note about the estimation of F(z) is that under Assumption 1, $x_i(k)$ is uncorrelated with
$w_{10}(n)$, $u_i(n)$ and $w_{2i}(n)$. Thus one can express (10) as

\[ \begin{bmatrix} Y_0(z) \\ \vdots \\ Y_{L-1}(z) \end{bmatrix} = F(z) \begin{bmatrix} X_0(z) \\ \vdots \\ X_{M-1}(z) \end{bmatrix} + \Gamma(z), \tag{11} \]

where $\gamma(n)$ is uncorrelated with x(n).


Suppose the order of H(z) is $l_H$ and the order of G(z) is $l_G$. In (1), $H_i(z) = \sum_n h_i(n) z^{-n}$,
where

\[ h_i(n) = h(Mn + i), \quad 0 \le i \le M-1, \quad 0 \le Mn + i \le l_H. \tag{12} \]

Similarly, in (2), $G_i(z) = \sum_n g_i(n) z^{-n}$, where

\[ g_i(n) = g(Ln + L - 1 - i), \quad 0 \le i \le L-1, \quad 0 \le Ln + L - 1 - i \le l_G. \tag{13} \]

By padding zeros, let the length of the sequence of coefficients of $H_i(z)$ be $l_h + 1$ and that
of $G_i(z)$ be $l_g + 1$, where $l_h$ and $l_g$ are the maximum orders of the polynomial set $H_i(z)$ and
the polynomial set $G_i(z)$ respectively. They are related to $l_H$ and $l_G$ as follows:

\[ l_h + 1 = \lceil (l_H + 1)/M \rceil, \qquad l_g + 1 = \lceil (l_G + 1)/L \rceil, \]

where $\lceil x \rceil$ stands for the smallest integer that is greater than or equal to x.
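As a small worked instance of these order relations, the sketch below splits a hypothetical 8-tap response (so $l_H = 7$) into M = 3 polyphase components and zero-pads them to the common length $l_h + 1 = \lceil 8/3 \rceil = 3$. The function names and tap values are illustrative only.

```python
import math
import numpy as np

def max_polyphase_order(order, factor):
    """l_h (or l_g) from l_H (or l_G): ceil((order + 1) / factor) - 1."""
    return math.ceil((order + 1) / factor) - 1

def padded_components(h, M):
    """Rows are h_j(n) = h(Mn + j), zero-padded to the common length l_h + 1."""
    n_taps = max_polyphase_order(len(h) - 1, M) + 1
    comps = np.zeros((M, n_taps))
    for j in range(M):
        c = h[j::M]
        comps[j, :len(c)] = c
    return comps

h = np.arange(1.0, 9.0)          # hypothetical taps h(0..7) = 1, 2, ..., 8
comps = padded_components(h, 3)
print(max_polyphase_order(7, 3))
print(comps)
```

Here the third component has only two genuine taps, h(2) and h(5), and receives one padding zero.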

Let $l_f = l_g + l_h$ and define

\[ F_{ij}(z) = G_i(z) H_j(z) = \sum_{k=0}^{l_f} f_{ij}(k) z^{-k}. \tag{14} \]

For some integer N to be specified in a later section, the Toeplitz filtering matrix of $F_{ij}(z)$
is defined as

\[ T_N(F_{ij}) = \begin{bmatrix} f_{ij}(0) & \cdots & f_{ij}(l_f) & & 0 \\ & \ddots & & \ddots & \\ 0 & & f_{ij}(0) & \cdots & f_{ij}(l_f) \end{bmatrix}. \tag{15} \]

$T_N(F_{ij})$ is an $N \times (l_f + N)$ matrix.
Denote


\[ X_i(k) = [x_i(k), \cdots, x_i(k - l_f - N + 1)]^T, \quad 0 \le i \le M-1 \tag{16} \]

\[ X(k) = [X_0(k)^T, \cdots, X_{M-1}(k)^T]^T \tag{17} \]

\[ Y_i(k) = [y_i(k), \cdots, y_i(k - N + 1)]^T. \tag{18} \]

By aligning M matrices $T_N(F_{ij})$, we get the block Toeplitz matrix

\[ \mathcal{T}_i(N) = [T_N(F_{i0}), \cdots, T_N(F_{i(M-1)})]. \tag{19} \]

Hence, equation (10) can be expressed as

\[ Y_i(k) = \mathcal{T}_i(N) X(k) + \eta(k), \quad 0 \le i < L, \tag{20} \]

where $\eta(k)$ is uncorrelated with X(k). The inverse of $R_{XX} = E(XX^T)$ exists because of
assumption 1. Then we have

\[ \mathcal{T}_i(N) = R_{Y_i X} R_{XX}^{-1}, \tag{21} \]

where $R_{Y_i X} = E(Y_i X^T)$ and $E(\cdot)$ denotes the expectation operation. Observe, the estimation
of $\mathcal{T}_i(N)$ for each i in $\{0, 1, \cdots, L-1\}$ provides F(z), accomplishing the estimation of the
roundtrip dynamics.
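The estimator (21) can be sketched with sample averages in place of expectations. The simulation below is noise free (an assumption made so that the result can be compared exactly with the true block Toeplitz matrix); with noise and interference present the estimate would only be approximate. Tap values, N, the record length K, and all function names are hypothetical.

```python
import numpy as np

def filtering_matrix(f, N):
    """N x (len(f) - 1 + N) Toeplitz filtering matrix, as in (15)."""
    T = np.zeros((N, len(f) - 1 + N))
    for r in range(N):
        T[r, r:r + len(f)] = f
    return T

def roundtrip(x, h, g, M, L):
    """Noise-free chain: filter by h, decimate by M, expand by L, filter by g."""
    v = np.convolve(x, h)[::M]
    up = np.zeros(len(v) * L)
    up[::L] = v
    return np.convolve(up, g)

M, L, N, K = 3, 2, 3, 400
h = np.array([1.0, 0.5, -0.3, 0.2, 0.7, -0.1])   # hypothetical FLC taps
g = np.array([0.8, -0.4, 0.1, 0.3])              # hypothetical RLC taps
Hp = [h[j::M] for j in range(M)]                 # h_j(n) = h(Mn + j)
Gp = [g[L - 1 - i::L] for i in range(L)]         # g_i(n) = g(Ln + L - 1 - i)
lf = (len(Hp[0]) - 1) + (len(Gp[0]) - 1)         # l_f = l_h + l_g
width = lf + N                                   # length of each X_j(k)

rng = np.random.default_rng(1)
x = rng.standard_normal(M * K)                   # white input known to the BS
y = roundtrip(x, h, g, M, L)
xs = [np.concatenate([np.zeros(j), x])[::M][:K] for j in range(M)]  # x_j(n)
ys = [y[L - 1 - i::L][:K] for i in range(L)]                        # y_i(n)

ks = range(width - 1, K)
# Each row is X(k)^T: the stacked regressors [x_j(k), ..., x_j(k - l_f - N + 1)].
X = np.array([np.concatenate([xj[k - width + 1:k + 1][::-1] for xj in xs])
              for k in ks])
R_xx = X.T @ X / len(ks)

err = 0.0
for i in range(L):
    # Each row is Y_i(k)^T = [y_i(k), ..., y_i(k - N + 1)].
    Y = np.array([ys[i][k - N + 1:k + 1][::-1] for k in ks])
    R_yx = Y.T @ X / len(ks)
    T_hat = np.linalg.solve(R_xx, R_yx.T).T      # R_yx R_xx^{-1}, R_xx symmetric
    T_true = np.hstack([filtering_matrix(np.convolve(Gp[i], hj), N) for hj in Hp])
    err = max(err, float(np.max(np.abs(T_hat - T_true))))
print(err)
```

The tap counts were chosen as multiples of M and L so that no zero padding of the polyphase components is needed in this illustration.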

V.  Unraveling  FLC  and  RLC  from  the  estimated  RTC 

We now demonstrate how the estimate of $\mathcal{T}_i(N)$ obtained in the previous section can be
used to separate the FLC and RLC dynamics. For simplicity, sections V-A and V-B provide
unraveling algorithms under assumptions 3 and 2 respectively, with the added assumption
that the cognizant polyphase components do not even share delays. Recall that this does
not necessarily imply the absence of delays in the pertinent channels but rather that delays
are smaller than M or L depending on whether the channel in question is the FLC or RLC.
Methods of accommodating common delay factors among the polyphase components are in
section V-C.

A. Algorithm when the $G_i(z)$ are coprime and do not share a delay

Similar to the definition of $T_N(F_{ij})$ in equation (15), we can define $T_N(H_j)$ and $T_N(G_i)$
as the Toeplitz filtering matrices associated with $H_j(z)$ and $G_i(z)$ respectively. $T_N(H_j)$ has
dimension $N \times (l_h + N)$ and $T_N(G_i)$ has dimension $N \times (l_g + N)$. Define the $LN \times (N + l_g)$
block Toeplitz matrix,

\[ \mathcal{G} = \begin{bmatrix} T_N(G_0) \\ T_N(G_1) \\ \vdots \\ T_N(G_{L-1}) \end{bmatrix}. \tag{22} \]

We note first the following well known fact, [16]:

Fact 1: If the $G_i(z)$ have no common factors, not even delays, then for a sufficiently large
N specified in [16], $\mathcal{G}$ has full column rank. Further, the left null space of $\mathcal{G}$ provides $G_i(z)$
to within a scaling constant.

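In the sequel choose N to be the smallest integer for which $\mathcal{G}$ has full column rank. Now
define the $(l_g + N) \times M(l_f + N)$ matrix

\[ \mathcal{H} = [T_{l_g+N}(H_0), T_{l_g+N}(H_1), \cdots, T_{l_g+N}(H_{M-1})]. \tag{23} \]

Note that the following relation holds

\[ T_N(F_{ij}) = T_N(H_j) T_{l_h+N}(G_i) = T_N(G_i) T_{l_g+N}(H_j). \tag{24} \]

Consequently, one obtains:

\[ \mathcal{T} = \mathcal{G}\mathcal{H}. \tag{25} \]

Define

\[ \mathcal{T} = [\mathcal{T}_0(N)^T, \cdots, \mathcal{T}_{L-1}(N)^T]^T \tag{26} \]

and recall that section IV provides $\mathcal{T}$.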

It is evident that $\mathcal{H}$ is full row rank. Therefore, the column vectors of $\mathcal{T}$ span the same
subspace defined by the column vectors of $\mathcal{G}$, referred to in the sequel as the signal subspace.
More importantly the left null space of $\mathcal{T}$ is identical to the left null space of $\mathcal{G}$. Fact 1 then
sets up the direct estimation of G(z) from the left null space of the estimate of $\mathcal{T}$ provided
in the previous section.
Allowing for estimation errors because of finite data records, call $\hat{\mathcal{T}}$ the estimate of $\mathcal{T}$
provided by the algorithm in the previous section. Denote the error matrix as $\Delta\mathcal{T}$, i.e.

\[ \hat{\mathcal{T}} = \mathcal{G}\mathcal{H} + \Delta\mathcal{T}. \]

Hence we have the following result after performing a singular value decomposition (SVD)
of $\hat{\mathcal{T}}$:

\[ \hat{\mathcal{T}} = \mathcal{G}\mathcal{H} + \Delta\mathcal{T} = \begin{bmatrix} V_s & V_n \end{bmatrix} \begin{bmatrix} \Sigma_s & 0 \\ 0 & \Sigma_n \end{bmatrix} \begin{bmatrix} W_s^H \\ W_n^H \end{bmatrix}, \tag{27} \]

where $(\cdot)^H$ stands for the Hermitian transpose, the column vectors in $V_s$ span the signal
subspace, and the column vectors in $V_n$ span the left null space of $\mathcal{T}$ and hence of $\mathcal{G}$. The
dimensions of $V_s$ and $V_n$ are $LN \times (l_g + N)$ and $LN \times (LN - l_g - N)$.

If there is no estimation error, the i-th column $p_i$ of $V_n$ satisfies

\[ p_i^H \mathcal{G} = 0, \tag{28} \]

where $p_i = [p_i(0), \cdots, p_i(LN - 1)]^T$. In practice, when estimation errors exist, (28) can be
solved in the least square sense, i.e. by minimizing the following quadratic form

\[ q(g) = \sum_{i=0}^{LN - l_g - N - 1} \| p_i^H \mathcal{G} \|^2, \tag{29} \]

where

\[ g = [g_0(0), \cdots, g_0(l_g), \cdots, g_{L-1}(0), \cdots, g_{L-1}(l_g)]. \tag{30} \]

In order to solve for g in equation (29), we follow the method in [16]. Similar to the definition of $T_N(F_{ij})$, we define the filtering matrix $T_{l_g+1}(P_{ij})$ associated with $P_{ij}(z)$, $0 \le i \le LN - l_g - N - 1$, $0 \le j \le L - 1$, where

$$P_{ij}(z) = \sum_{k=0}^{N-1} p_i(jN + k) z^{-k}. \qquad (31)$$

The dimension of $T_{l_g+1}(P_{ij})$ is $(l_g + 1) \times (l_g + N)$. Define

$$\mathcal{P}_i = [T_{l_g+1}(P_{i0})^T, \ldots, T_{l_g+1}(P_{i(L-1)})^T]^T. \qquad (32)$$

Then equation (29) can be transformed into

$$q(g) = g P g^H, \quad \text{where } P = \sum_{i=0}^{LN-l_g-N-1} \mathcal{P}_i \mathcal{P}_i^H. \qquad (33)$$

The solution for g is the eigenvector associated with the smallest eigenvalue of the matrix P. Note that g consists of all the coefficients of G(z). Thus the RLC is identified up to a multiplicative constant. From the estimated G(z), we can construct $\hat{\mathcal{G}}$ as in (22). From

$$\hat{\mathcal{H}} = \hat{\mathcal{G}}^{\dagger} \hat{\mathcal{T}}, \qquad (34)$$

where $(\cdot)^{\dagger}$ indicates pseudoinverse, we can also identify the FLC up to a multiplicative constant.
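
The subspace step of this subsection can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: it assumes that the stacked matrix in (22) consists of the N-row Toeplitz filtering matrices of the L polyphase components of G(z), that the data are noise free, and that the quadratic form is minimized over unit-norm g. The helper names `filtering_matrix` and `estimate_g`, and the random test channel, are hypothetical.

```python
import numpy as np

def filtering_matrix(c, rows):
    """rows x (rows + len(c) - 1) Toeplitz matrix whose r-th row is c shifted right by r."""
    c = np.asarray(c)
    T = np.zeros((rows, rows + len(c) - 1), dtype=c.dtype)
    for r in range(rows):
        T[r, r:r + len(c)] = c
    return T

def estimate_g(T_hat, L, N, lg):
    """Estimate g = [g_0(0..lg), ..., g_{L-1}(0..lg)] up to a scalar from T_hat ~ G H."""
    U, _, _ = np.linalg.svd(T_hat)       # left singular vectors of T_hat
    Vn = U[:, lg + N:]                   # basis of the (approximate) left null space
    dim = L * (lg + 1)
    P = np.zeros((dim, dim), dtype=complex)
    for p in Vn.T:
        # Stack the filtering matrices built from the L length-N blocks of conj(p).
        Pi = np.vstack([filtering_matrix(np.conj(p[j * N:(j + 1) * N]), lg + 1)
                        for j in range(L)])
        P += Pi @ Pi.conj().T            # accumulate the quadratic form
    _, V = np.linalg.eigh(P)             # eigenvalues returned in ascending order
    return V[:, 0].conj()                # minimizer of g P g^H over unit-norm row vectors g

# Hypothetical noise-free test case: L = 3 polyphase components of order lg = 2.
rng = np.random.default_rng(0)
L, N, lg = 3, 4, 2
g_true = rng.standard_normal((L, lg + 1))
G = np.vstack([filtering_matrix(g_true[j], N) for j in range(L)])  # LN x (N + lg)
H = rng.standard_normal((N + lg, 40))                              # any full-row-rank factor
g_est = estimate_g(G @ H, L, N, lg)

# Agreement up to a multiplicative constant: normalized correlation close to 1.
corr = abs(np.vdot(g_est, g_true.ravel())) / (np.linalg.norm(g_est) * np.linalg.norm(g_true))
print(corr)
```

With finite, noisy data the trailing singular values are no longer exactly zero, and the same code then returns the least squares solution rather than an exact one.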

B. Algorithm when the $H_i(z)$ are coprime and do not share a delay

The foregoing algorithm deals with the case when assumption 3 is satisfied together with the stronger condition that the polyphase components $G_j(z)$ do not share any delays. In this case, we identify G(z) first, and then identify H(z) from the estimated G(z). Now suppose that assumption 2 is satisfied together with the condition that the polyphase components $H_j(z)$ do not share any delays.

Define  for  some  integer  N 

mm 
mm. 

'T\iI1\i  ) 

Under assumption 2 and the lack of common delay factors among the $H_i(z)$ there exists an N for which $\mathcal{H}$ has full column rank. Choose


^UN)  =  [Tv(F0;).....T^(jF(L_1,)] 

.r-  [./!,(. V rw  (  V :7]7 


and an algorithm very similar to that in Section V-A accomplishes the desired unraveling.

C. Treating common delay factors

We now consider the situation where the pertinent polyphase components may have common delay factors. We will explain the underlying ideas with Assumption 3 in force. Similar ideas apply to the case where Assumption 2 holds.

Suppose that $G_i(z) = z^{-l} \bar{G}_i(z)$ where the $\bar{G}_i(z)$ have no common factors, i.e. $z^{-l}$ is the gcd of the $G_i(z)$. Suppose $H_i(z) = z^{-n} \bar{H}_i(z)$. Then because of the upper triangular Toeplitz nature of $T_N(G_i)$ and $T_N(H_i)$, and because of (24), $\mathcal{T}$ has $l + n$ columns that are zero vectors. These zero columns also appear in $\hat{\mathcal{T}}$. The matrix $\bar{\mathcal{T}}$ obtained by removing these zero columns can be expressed as

$$\bar{\mathcal{T}} = \bar{\mathcal{G}} \bar{\mathcal{H}}$$

where $\bar{\mathcal{G}}$ in particular is

$$\bar{\mathcal{G}} = \begin{pmatrix} T_N(\bar{G}_0) \\ T_N(\bar{G}_1) \\ \vdots \\ T_N(\bar{G}_{L-1}) \end{pmatrix}$$

and, as the $\bar{G}_i(z)$ have no common factors, has full rank and has identical null space as $\bar{\mathcal{T}}$. Consequently the algorithm in Section V-A working with $\bar{\mathcal{T}}$ rather than $\mathcal{T}$ suffices to estimate G(z) and H(z) to within a scalar and delay ambiguity.


VI.  Simulations 


We present two simulation examples, the first to illustrate the basic performance of the algorithm, and the second to illustrate the reduction of training levels needed on the FLC when the channel parameters change with time. Before presenting these simulations we make precise our notions of SNR and SIR (caused by the interference effect of poorly synchronized RLC data).

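Denote noise, interference and signal at the output of RLC as w(n), $v_o(n)$ and z(n) respectively. We can see $y(n) = z(n) + w(n) + v_o(n)$. SNR and SIR are defined as

$$\mathrm{SNR} = \frac{E(z^2(n))}{E(w^2(n))} \qquad (37)$$

$$\mathrm{SIR} = \frac{E(z^2(n))}{E(v_o^2(n))} \qquad (38)$$

If the variances of x(n), $w_1(n)$, $w_2(n)$ and u(n) are $\sigma_x^2$, $\sigma_{w_1}^2$, $\sigma_{w_2}^2$ and $\sigma_u^2$ respectively, and if the signals are wide sense stationary (WSS), by a derivation given in the appendix, we have

$$\mathrm{SNR} = \frac{\sigma_x^2 \sum_{i=0}^{L-1} \sum_{j=0}^{M-1} \sum_{k=0}^{l_f} f_{ij}^2(k)}{\sigma_{w_1}^2 \sum_{k=0}^{l_g} g^2(k) + L \sigma_{w_2}^2} \qquad (39)$$

$$\mathrm{SIR} = \frac{\sigma_x^2 \sum_{i=0}^{L-1} \sum_{j=0}^{M-1} \sum_{k=0}^{l_f} f_{ij}^2(k)}{L \sigma_u^2 \sum_{k=0}^{l_g} g^2(k)} \qquad (40)$$

where $f_{ij}(k)$ is defined in (14) and $l_g$ is the order of RLC.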
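
As a small numerical aid, the two ratios can be evaluated directly from the channel coefficients. The sketch below assumes that (39) and (40) take the form implied by the appendix: signal power $\sigma_x^2 \sum f_{ij}^2(k) / L$, noise power $\sigma_{w_1}^2 \sum g^2(k) / L + \sigma_{w_2}^2$ and interference power $\sigma_u^2 \sum g^2(k)$. The function name `snr_sir` and the toy inputs are hypothetical; $f_{ij}(k)$ is passed in as an array because its definition (14) lies outside this section.

```python
import numpy as np

def snr_sir(f, g, var_x, var_w1, var_w2, var_u):
    """Evaluate SNR and SIR at the RLC output.

    f : array of shape (L, M, lf + 1) holding f_ij(k)
    g : RLC impulse response g(0), ..., g(lg)
    """
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    L = f.shape[0]
    signal = var_x * np.sum(f ** 2)        # L times the signal power E(z^2(n))
    g_energy = np.sum(g ** 2)
    snr = signal / (var_w1 * g_energy + L * var_w2)
    sir = signal / (L * var_u * g_energy)
    return snr, sir

# Toy numbers chosen only so the result is easy to check by hand.
f_toy = np.ones((2, 3, 1))                 # L = 2, M = 3, lf = 0, sum of squares = 6
g_toy = np.array([1.0, 1.0])               # energy = 2
snr, sir = snr_sir(f_toy, g_toy, var_x=1.0, var_w1=0.5, var_w2=0.25, var_u=0.5)
print(snr, sir)
```

For these toy values the denominators are 0.5·2 + 2·0.25 = 1.5 and 2·0.5·2 = 2, so the ratios are 4 and 3.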

Simulation example I: The FLC and RLC are generated from two delayed raised-cosine pulses $C(t, \alpha)$, where $\alpha$ is the roll-off factor. $C(t, \alpha)$ is limited to 8T for FLC and to 6T for RLC, where T is the symbol interval.


FLC = 0.1 C(t, 0.25) + 0.8 C(t - T/2, 0.25)  (41)

RLC  =  0.oC(t,  0.10)  -  0.7 C(t  -  T/3.0.10)  (42) 


Their  corresponding  discrete  time  channel  coefficients  are 


FLC = [0.0129, -0.0326, 0.0693, -0.1485, 0.6019, 0.5019, -0.1485, 0.0693, -0.0326]  (43)

RLC = [0.0521, -0.0786, 0.1423, -0.0783, -0.2882, 0.1128, -0.0677]  (44)

We use downsampling factor M = 3 and upsampling factor L = 2. Hence $l_h = 3$ and $l_g = 4$. The noise signals $w_1(n)$ and $w_2(n)$ are zero mean and have the same variance. The input signal is i.i.d. BPSK. We focus on estimating and compensating the FLC in the interference free case.

To quantify the quality of channel estimation, we define the normalized root-mean-square error (NRMSE) as

$$\mathrm{NRMSE} = \frac{1}{\|h\|} \sqrt{\frac{1}{M} \sum_{i=1}^{M} \|\hat{h}^{(i)} - h\|^2} \qquad (45)$$

where M is the number of Monte Carlo runs, h is the real channel and $\hat{h}^{(i)}$ is the estimate in the i-th run. Fig. 5 shows NRMSE versus total SNR, where total SNR is defined in (39).
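
A direct way to compute this figure of merit is sketched below, assuming (45) is the usual definition: the root of the run-averaged squared estimation error, divided by the norm of the true channel. The function name `nrmse` is hypothetical, and the sketch assumes the scalar ambiguity of each estimate has already been removed; how that is done is not stated in this section.

```python
import numpy as np

def nrmse(h_true, h_estimates):
    """Normalized root-mean-square error over Monte Carlo runs.

    h_true      : true channel, shape (n,)
    h_estimates : one estimate per run, shape (runs, n)
    """
    h_true = np.asarray(h_true)
    H = np.asarray(h_estimates)
    sq_err = np.sum(np.abs(H - h_true) ** 2, axis=1)   # squared error of each run
    return np.sqrt(np.mean(sq_err)) / np.linalg.norm(h_true)

# Two toy runs: one perfect estimate and one with squared error 25.
value = nrmse([3.0, 4.0], [[3.0, 4.0], [3.0, 9.0]])
print(value)
```

In the toy case the mean squared error is 12.5 and the channel norm is 5, so the result is sqrt(12.5)/5 = sqrt(0.5).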

After FLC and RLC are estimated, a pre-equalizer is built for the FLC and a post-equalizer for the RLC. The length of the zero-forcing equalizer is three times the corresponding real channel length. The equalizer SNR is the SNR at the output of the equalized system. That is, for the FLC the SNR is that at the output of the channel, but for the RLC the SNR is that at the output of the post-equalizer. BER versus equalizer SNR is displayed in Fig. 6.

To evaluate the effect of interference due to uplink data, Fig. 7 shows the BER versus SIR at fixed SNR.


Fig. 4. The channel impulse responses of FLC and RLC.

Fig. 5. NRMSE versus total SNR, without interference. 100 Monte Carlo runs, 500 symbols for each run.

Simulation  example  II:  Reduced  training  for  a  time  varying  channel: 

We use the COST-207 Typical Urban (TU) model [7] with 100 echo paths, BPSK data and a maximum Doppler frequency of 55 Hz. We assume the channels to be quasistatic, i.e. time-invariant in one frame and time-variant from frame to frame. Each frame lasts 400 μs (cf. 557 μs for GSM) and the roundtrip delay is 16.67 μs. The receive filters for FLC and RLC are raised cosine functions with roll-off factors 0.2 and 0.1 respectively. The FLC sustains a data rate of 1 Mbps, and the RLC supports 0.667 Mbps.

Fig. 6. BER versus equalizer SNR without interference. Channels are estimated at 20 dB with 500 symbols in each run. 100 Monte Carlo runs.

Fig. 7. BER versus SIR with 20 dB total SNR and 25 dB equalizer SNR. 500 symbols in each run. 100 Monte Carlo runs.

Two  situations  are  compared: 

(a)  Training  aided  equalization  of  FLC  at  the  MS  and  of  the  RLC  at  the  BS,  with  no 
feedback. 

(b)  No  training  on  the  RLC,  but  instead  sending  feedback  data  of  the  same  length  as  the 
RLC  training  data  in  (a).  A  precompensator,  obtained  using  the  scheme  of  this  paper  is  used 
on  the  FLC  and  is  augmented  by  a  post-equalizer  estimated  at  the  receiver  using  reduced 
training. 

Both methods use the same input signal power and noise power. Figure 8 shows $n_b/n_t$ versus input SNR for both methods to achieve the same BER. Here $n_t$ is the length of the FLC training sequence used in (a) and $n_b$ the length of training used for partially compensated channel estimation in (b), so that the same FLC BER is obtained in both cases.

As is evident from the simulation, at 18 dB SNR our algorithm requires only 3$% of the training length on the FLC needed by conventional training based FLC compensation. Translated to a GSM setting, instead of devoting 1/6th of the transmission time to training, only 1/19th is needed to achieve the same performance in this time varying setting.

Given that the feedback data on the RLC replaces and has the same length as the training data in (a), this represents substantial savings in bandwidth.


Fig.  8.  n-b/nt  versus  SNR  at  the  same  BER. 


VII. Conclusion

We  have  proposed  a  new  feedback  scheme  that  permits  FLC  and  RLC  channel  estimation 
from  the  BS,  in  principle  without  the  use  of  training  signals  or  blind  estimation  methods.  In 
practice  it  leads  to  significant  savings  in  the  bandwidth  consumed  by  training  signals,  and 
the transfer of some of the computational burden currently shouldered by the MS to the more resource rich BS. The feasibility of these ideas has been demonstrated through both theory and simulations. The possibility of unraveling FLC and RLC from the information of RTC is a core novelty of the paper.

Appendix 

Similar to equation (3), we define

$$z_i(n) = z(nL + L - 1 - i), \quad 0 \le i \le L - 1. \qquad (46)$$

So $E(z^2(n)) = \sum_{i=0}^{L-1} E(z_i^2(n)) / L$. And we know $z_i(n) = \sum_{j=0}^{M-1} \sum_{k=0}^{l_f} f_{ij}(k) x_j(n - k)$, so that

$$E(z_i^2(n)) = E\Big\{ \sum_{j=0}^{M-1} \sum_{k=0}^{l_f} f_{ij}(k) x_j(n - k) \sum_{p=0}^{M-1} \sum_{q=0}^{l_f} f_{ip}(q) x_p(n - q) \Big\}. \qquad (47)$$

From assumption 1, we know $E(x_j(n-k) x_p(n-q)) = \delta(p - j) \delta(k - q) \sigma_x^2$, where $\delta$ is Kronecker's delta. Then,

$$E(z_i^2(n)) = \sigma_x^2 \sum_{j=0}^{M-1} \sum_{k=0}^{l_f} f_{ij}^2(k), \qquad (48)$$

$$E(z^2(n)) = \sigma_x^2 \sum_{i=0}^{L-1} \sum_{j=0}^{M-1} \sum_{k=0}^{l_f} f_{ij}^2(k) / L. \qquad (49)$$

From equations (10) and (8), we can see $w_{1i}(n)$ will pass through $G_i(z)$ for all $0 \le i \le L - 1$. Note that the variance of $w_1(n)$ is equal to that of $w_{1i}(n)$, and $w_1(n)$ and $w_2(n)$ are uncorrelated and zero mean. Similar to the derivation of $E(z_i^2(n))$, we have

$$E(w^2(n)) = \sigma_{w_1}^2 \sum_{i=0}^{L-1} \sum_{k} g_i^2(k) / L + \sigma_{w_2}^2. \qquad (50)$$

And we know $\sum_{i=0}^{L-1} \sum_{k} g_i^2(k) = \sum_{k=0}^{l_g} g^2(k)$. From (49) and (50), we get the expression for SNR in (39). It is easily seen that the interference component at the output of RLC is the signal obtained by passing u(n) through the channel G(z). So we have

$$E(v_o^2(n)) = \sigma_u^2 \sum_{k=0}^{l_g} g^2(k). \qquad (51)$$

ABSTRACT

This  paper  considers  the  design  of  biorthogonal  DMT  mul¬ 
ticarrier  transceiver  systems  supporting  multiple  services. 
The  supported  user  services  may  have  differing  quality  of 
service (QoS) requirements, quantified in this paper by bit
rate  and  symbol  error  rate  specifications.  Our  goal  is  to 
minimize  the  transmitted  power  given  the  QoS  specifica¬ 
tions  for  the  different  users,  subject  to  the  knowledge  of 
colored  interference  at  the  receiver  input  of  the  DMT  sys¬ 
tem.  In  particular  we  find  an  optimum  bit  loading  scheme 
that  distributes  the  bit  rate  transmitted  across  the  various 
subchannels  belonging  to  the  different  users,  and  subject  to 
this  bit  allocation,  determine  an  optimum  transceiver.  This 
work  differs  from  our  prior  work  [6 ]  where  orthonormal 
transceivers  were  considered. 

1.  Introduction 

Future  broadband  communication  systems  will  be  expected 
to  deliver  multiple  services,  such  as  voice,  data,  video,  with 
multiple-stream  support.  Because  delivery  of  these  streams 
will  be  under  differing  requirements  such  as  information 
rate  and  error  performance,  allocation  of  critical  resources 
like  power  would  have  a  significant  impact  on  the  overall 
performance  of  the  communication  system.  Discrete  Mul¬ 
titone  (DMT)  transmission  involves  a  channel  coding  tech¬ 
nique  to  achieve  reliable,  high  data  rate  communications  in 
such  systems.  It  is  a  current  standard  in  various  wireline 
applications  like  ADSL,  VDSL,  [10],  and  in  the  form  of  Or¬ 
thogonal  frequency  division  multiplexing  (OFDM)  has  been 
proposed  for  fixed  wireless  standards  like  IEEE  802.11a, 
[12]. This paper considers transceiver optimization for such
multicarrier  transmission  systems  operating  in  a  multiuser 
environment. 

More specifically, we assume that a single DMT system supports r users, each having its own QoS specification quantified by its bit rate and symbol error rate (SER). The k-th user requires a bit rate of $t_k$, and an SER of no more than $\eta_k$. An equal number of subchannels are assigned to each user. As proposed in several recent papers [7], [6], [9], we consider general DMT transceivers which are more general than the traditional DFT based systems in that the input and output transforms are general block transforms. We consider biorthogonal systems employing zero padding redundancy with the redundancy removal at the receiver being a general linear operation. Our goal is to select the input and output block transforms $G_0$, $S_0$ (see fig. 1), the linear operation reflecting redundancy removal, the number of bits/symbol assigned to each subchannel, and the subchannels assigned to each user to achieve the QoS specifications, under a zero intersymbol interference (ISI) condition with the minimum possible transmitted power. We assume that the channel and equalizer are known and so is the interference autocorrelation.

We  thus  generalize  our  earlier  result  reported  in  [6],  where 
the  same  optimization  problem  was  considered  with  an  or¬ 
thonormal  transceiver  under  the  assumption  that  each  user 
is  assigned  the  same  number  of  subchannels.  We  shall  see  in 
the  following  sections  how  the  extension  to  [6],  considered 
here  nontrivially  modifies  the  optimization  problem. 

Figure  1  depicts  the  DMT  communications  system  un¬ 
der  consideration.  An  incoming  data  stream  is  converted 
into  M  parallel  data  streams  of  lower  rate.  An  M-point 
block  transformation  Go,  of  these  streams  of  data  is  fol¬ 
lowed  by  a  parallel-to-serial  conversion,  prior  to  transmis¬ 
sion  through  the  communication  channel.  An  equalizer  is 
employed  to  shorten  the  dispersive  effects  of  the  transmis¬ 
sion  channel.  The  equalized  channel  C(z )  is  assumed  to  be 
FIR  of  length  k.  For  an  FIR  equalized  channel  of  length  k, 
extra  redundancy  of  length  k  in  the  form  of  zero  padding 
is  added  at  the  channel  input  to  infuse  resistance  to  chan¬ 
nel  induced  ISI.  At  the  channel  output,  one  performs  in 
succession  the  operations  of  redundancy  removal,  serial-to- 
parallel  conversion,  and  the  application  of  an  inverse  block 
transform $S_0$.

Past  treatment  of  optimum  resource  allocation,  [1],  [2], 
[3],  has  been  restricted  mostly  to  bit  loading  and  power  allo¬ 
cation  algorithms.  Some  authors  have  studied  the  optimum 
transceiver  design  in  the  single  user  case,  [7],  [9].  While 
[7]  was  concerned  with  optimizing  the  transmitted  power, 
[9] focussed on the maximization of the mutual information between the transmitted and received signals. In [5] the
authors  consider  the  problem  considered  here  for  the  sin¬ 
gle  user  case  of  r  =  1,  and  with  orthonormality  condition 
enforced.  In  [7]  the  single  user  case  is  considered  with  or¬ 
thogonality  removed.  Reference  [7]  shows  that  in  the  single 
user  case  biorthogonality  leads  to  no  improvement  in  the 
transmitted power. Likewise a major conclusion of this paper is to show that even in the multiuser environment with potentially asymmetric subchannel allocations, optimal performance is achieved by orthonormal transformations.


Fig.  1.  DMT  communication  system. 


2.  Formulation 

In  this  Section  we  give  some  preliminaries.  Specifically,  in 
Section  2.1,  we  recount  the  details  of  the  generalized  DMT 
system,  along  the  lines  of  [7].  Section  2.2  provides  a  precise 
description  of  the  optimization  problem. 

2.1.  Preliminaries 

Barring [7], most papers assume that $G_0$ is unitary, i.e.

$$G_0^H G_0 = I. \qquad (2.1)$$

In the biorthogonal case considered here we relax (2.1) and simply assume that $G_0$ and $S_0$ can be arbitrary nonsingular M x M matrices. Denote the blocks of M input and output symbols respectively by $x(n) = [x_0(n), \ldots, x_{M-1}(n)]^T$ and $\hat{x}(n) = [\hat{x}_0(n), \ldots, \hat{x}_{M-1}(n)]^T$. With v(n) the noise and interference effect at the output of the equalizer, denote $\mathbf{v}(n) = [v(Nn), v(Nn + 1), \ldots, v(Nn + N - 1)]^T$ as the N-fold blocked version of v(n), with N = M + k. Then one can show [5] that with $C_L$ an N x M constant matrix characterized by the k-th order FIR equalized channel, and $S_1$ an M x N matrix representing the linear redundancy removal operation, the blocked input-output relation of the system is given by

$$\hat{x}(n) = S_0 S_1 C_L G_0 x(n) + S_0 S_1 \mathbf{v}(n). \qquad (2.2)$$


We impose the perfect reconstruction (PR) condition, i.e., in the absence of noise/interference, $\hat{x}(n) = x(n)$ for all n. In other words,

$$S_0 S_1 C_L G_0 = I, \qquad (2.3)$$

and the DMT system has no ISI. To obtain a more useful characterization of PR, consider the singular value decomposition of $C_L$:

$$C_L = U_c \begin{pmatrix} \Lambda_c \\ 0 \end{pmatrix} V_c^H = U_0 \Lambda_c V_c^H, \qquad (2.4)$$


where $U_c$ and $V_c$ are respectively N x N and M x M unitary matrices and $\Lambda_c$ is an M x M real, positive definite diagonal matrix. Then, because of (2.3), given $G_0$, the class of all $S_0$, $S_1$ enforcing PR is completely characterized by

$$S_0 = G_0^{-1}, \qquad (2.5)$$

and

$$S_1 = V_c \Lambda_c^{-1} [I_M \;\; A] \, U_c^H, \qquad (2.6)$$

where A is any arbitrary M x k matrix. In the sequel it will be useful to partition $U_c$ as $U_c = [U_0 \;\; U_1]$, where $U_0$ is N x M and $U_1$ is N x k.

Note, as $V_c$, $U_c$ and $\Lambda_c$ are supplied by the channel, the only quantities that need to be found to determine the transceiver completely are $G_0$ and A.
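
The PR parameterization can be checked numerically. The sketch below is illustrative only: it assumes that the channel matrix is the N x M Toeplitz convolution matrix produced by zero padding, uses the channel $C(z) = 1 + 0.5 z^{-1}$ from the simulation section, and draws $G_0$ and A at random. The helper names `channel_matrix` and `redundancy_removal` are hypothetical.

```python
import numpy as np

def channel_matrix(c, M):
    """N x M Toeplitz convolution matrix for zero-padded blocks, N = M + (len(c) - 1)."""
    c = np.asarray(c, dtype=float)
    N = M + len(c) - 1
    C = np.zeros((N, M))
    for j in range(M):
        C[j:j + len(c), j] = c             # column j holds the impulse response delayed by j
    return C

def redundancy_removal(C, A):
    """S_1 = V_c Lambda_c^{-1} [I_M  A] U_c^H, following (2.6)."""
    N, M = C.shape
    U, s, Vh = np.linalg.svd(C)            # C = U[:, :M] @ diag(s) @ Vh
    IA = np.hstack([np.eye(M), A])         # M x N block matrix [I_M  A]
    return Vh.conj().T @ np.diag(1.0 / s) @ IA @ U.conj().T

rng = np.random.default_rng(1)
M = 4
C_L = channel_matrix([1.0, 0.5], M)        # first-order channel, so N = M + 1
A = rng.standard_normal((M, C_L.shape[0] - M))   # arbitrary free parameter
S1 = redundancy_removal(C_L, A)
G0 = rng.standard_normal((M, M))           # arbitrary G_0, nonsingular with probability one
S0 = np.linalg.inv(G0)                     # (2.5)

# Perfect reconstruction (2.3): S_0 S_1 C_L G_0 should equal the identity for any A.
pr_error = np.linalg.norm(S0 @ S1 @ C_L @ G0 - np.eye(M))
print(pr_error)
```

The residual is at the level of floating point round-off for every choice of A, which is what leaves A free to be optimized against the noise statistics.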


2.2.  Problem  formulation 

As mentioned earlier the M subchannels are distributed among the r users with each of the users being allocated L = M/r subchannels. Thus consider disjoint subsets $\mathcal{I}_k \subset \{0, \ldots, M - 1\}$ with $|\mathcal{I}_k| = L \ge 1$, and $\mathcal{I}_k \cap \mathcal{I}_j = \emptyset$, $k \ne j$. Subchannel assignment to the k-th user constitutes determining $\mathcal{I}_k$. We assume that the j-th subchannel of the k-th user is assigned $b_{j,k}$ bits per symbol. To meet the bit rate specification for the k-th user one requires that

$$\frac{1}{N} \sum_{j \in \mathcal{I}_k} b_{j,k} = t_k. \qquad (2.7)$$

Let the input power in the j-th subchannel of the k-th user be $\sigma^2_{x_{j,k}}$. Assume that $\sigma^2_{e_{j,k}}$ is the noise power in this subchannel. Under high SNR most modulation schemes [8] require that, to achieve a given SER, the required SNR is proportional to $2^{b_{j,k}}$. More precisely,

$$\frac{\sigma^2_{x_{j,k}}}{\sigma^2_{e_{j,k}}} = d_k 2^{b_{j,k}}, \qquad (2.8)$$

where the constant $d_k$ depends on the desired SER, $\eta_k$, for the k-th user. For example, for QAM, $d_k = \frac{1}{3} [Q^{-1}(\eta_k / 4)]^2$.


Under this framework, the transmitted power for the biorthogonal DMT system is given by

$$P_b = \sum_{k=1}^{r} \sum_{j \in \mathcal{I}_k} \sigma^2_{x_{j,k}} [G_0^H G_0]_{jj} \qquad (2.9)$$

$$\quad = \sum_{k=1}^{r} \sum_{j \in \mathcal{I}_k} d_k 2^{b_{j,k}} \sigma^2_{e_{j,k}} [G_0^H G_0]_{jj}. \qquad (2.10)$$

Define $R_v$ as the known autocorrelation matrix of the noise vector $\mathbf{v}(n)$, and

$$R_e = S_0 R_w S_0^H \quad \text{and} \quad R_w = S_1 R_v S_1^H. \qquad (2.11)$$

Then the $\sigma^2_{e_{j,k}}$ are the diagonal elements of $R_e$, the autocorrelation matrix of the receiver output noise vector e(n). Thus, because of (2.5), (2.10) can be rewritten as

$$P_b(S_0) = \sum_{k=1}^{r} \sum_{j \in \mathcal{I}_k} d_k 2^{b_{j,k}} [S_0^{-H} S_0^{-1}]_{jj} [S_0 R_w S_0^H]_{jj}. \qquad (2.12)$$

Thus the optimization problem becomes: Given $R_v$, L, $\eta_k$, $t_k$, minimize (2.12) subject to (2.7) by selecting $b_{j,k}$ (bit loading), $\mathcal{I}_k$ (subchannel assignment), $S_0$ (transformation selection) and, because of (2.6), A (redundancy removal selection).

We show that there is a conceptual separation between these selections, i.e. the optimizing A is determined exclusively by $R_v$, provided by the knowledge of the interference and equalizer characteristics; $S_0$ is determined entirely by A and the channel characteristics; the $\mathcal{I}_k$ are determined entirely by $R_e$, in turn provided by $S_1$ and $S_0$; and the bit allocations are determined once the above quantities are found. Further, as noted in the introduction, we will show that without loss of generality, the optimizing $S_0$, $G_0$ are unitary.


3.  Optimum selections

In this section we consider the selection of the various variables.

3.1.  Optimum Bit Loading

From the Arithmetic Mean-Geometric Mean (AM-GM) inequality, which states that the arithmetic mean is no smaller than the geometric mean, with equality iff all samples are equal, we have that for a given choice of $\mathcal{I}_k$ and $S_0$, under (2.7),

$$P_b \ge P_{bopt} = L \sum_{k=1}^{r} d_k \Big( 2^{N t_k} \prod_{j \in \mathcal{I}_k} [S_0^{-H} S_0^{-1}]_{jj} [S_0 R_w S_0^H]_{jj} \Big)^{1/L}$$

with equality iff for all k and $i, j \in \mathcal{I}_k$

$$2^{b_{i,k}} [S_0^{-H} S_0^{-1}]_{ii} [S_0 R_w S_0^H]_{ii} = 2^{b_{j,k}} [S_0^{-H} S_0^{-1}]_{jj} [S_0 R_w S_0^H]_{jj}. \qquad (3.13)$$

This is in turn equivalent to the optimum bit loading rule:

$$b_{j,k} = \frac{N t_k}{L} - \log_2 \frac{\sigma^2_{e_{j,k}} [G_0^H G_0]_{jj}}{\big( \prod_{i \in \mathcal{I}_k} \sigma^2_{e_{i,k}} [G_0^H G_0]_{ii} \big)^{1/L}}. \qquad (3.14)$$

Note that $P_{bopt}$ is much more complicated than its specialization, r = 1, studied in [5]. Thus, under optimum bit loading the remaining variables must be selected to minimize $P_{bopt}$. Observe that while the choice of these other variables impacts the selection of $b_{j,k}$, $P_{bopt}$ itself is independent of $b_{j,k}$. This underscores the fact that the remaining variables can be selected regardless of the precise values of $b_{j,k}$ obtained through (3.14).

3.2.  Selection of $S_0$, $\mathcal{I}_k$ and A

Assume for the moment that A and hence $S_1$ has been selected and that the resulting positive definite Hermitian $R_w$ has the SVD:

$$R_w = U \Lambda^2 U^H \qquad (3.15)$$

with $\Lambda = \mathrm{diag}\{\lambda_0, \ldots, \lambda_{M-1}\}$ real, diagonal and U unitary. The goal is to select $S_0$ and $\mathcal{I}_k$ to minimize $P_{bopt}$.

For convenience we first work with the minimization of

$$J(S_0) = \sum_{i=0}^{M-1} \alpha_i [S_0 R_w S_0^H]_{ii} [S_0^{-H} S_0^{-1}]_{ii} \qquad (3.16)$$

given positive $\alpha_i$. Note $J(S_0)$ has the form of $P_b$.

It is noteworthy that in [5], $S_0 = P U^H$, with P a permutation matrix, minimizes $P_{bopt}$. If $S_0$ is restricted to be unitary, then [11] shows that this choice of $S_0$ also minimizes (3.16). Consider, however, the example where $\alpha_i = 1$, M = 2 and $R_w = \mathrm{diag}\{9, 1\}$. Then observe that J(I) = 10 but with

$$C = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 1/\sqrt{3} & 0 \\ 0 & 1 \end{pmatrix},$$

J(C) = 8. Thus in general $S_0 = U^H$ does not minimize (3.16). However, we will show in the sequel that it does minimize $P_{bopt}$.

The following result shows that the search space of $S_0$ can be restricted to a particular form.

Lemma 3.1  For some unitary V, (3.16) is minimized by

$$S_0 = V \Lambda^{-1/2} U^H \qquad (3.17)$$

and (3.16) becomes

$$J(S_0) = \sum_{i=0}^{M-1} \alpha_i [V \Lambda V^H]_{ii}^2. \qquad (3.18)$$
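
The 2 x 2 example and the form (3.18) can be confirmed numerically. The sketch below assumes the garbled matrix C is the product of the normalized 2 x 2 Hadamard matrix and diag(1/√3, 1), which is the reading consistent with (3.17) for $R_w = \mathrm{diag}\{9, 1\}$. The function name `J` mirrors the cost (3.16); everything else is illustrative.

```python
import numpy as np

def J(S0, Rw, alpha):
    """Cost (3.16): sum_i alpha_i [S0 Rw S0^H]_ii [S0^{-H} S0^{-1}]_ii."""
    S0_inv = np.linalg.inv(S0)
    d1 = np.real(np.diag(S0 @ Rw @ S0.conj().T))
    d2 = np.real(np.diag(S0_inv.conj().T @ S0_inv))
    return float(np.sum(alpha * d1 * d2))

Rw = np.diag([9.0, 1.0])                   # so U = I and Lambda = diag(3, 1) in (3.15)
alpha = np.ones(2)
V = (1 / np.sqrt(2)) * np.array([[1.0, 1.0], [1.0, -1.0]])   # unitary (normalized Hadamard)
C = V @ np.diag([1 / np.sqrt(3), 1.0])     # assumed reading of the example matrix

J_identity = J(np.eye(2), Rw, alpha)       # unitary choice S0 = U^H = I
J_C = J(C, Rw, alpha)                      # non-unitary choice of the form (3.17)

# Closed form (3.18) for the same V: sum_i alpha_i [V Lambda V^H]_ii^2.
Lam = np.diag([3.0, 1.0])
J_closed_form = float(np.sum(alpha * np.real(np.diag(V @ Lam @ V.conj().T)) ** 2))
print(J_identity, J_C, J_closed_form)
```

Both diagonal entries of $V \Lambda V^H$ equal 2 here, so the closed form gives 4 + 4 = 8, matching J(C) and undercutting the unitary value of 10.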


Denote $\beta_k = d_k 2^{N t_k / L}$ and $p_j = [V \Lambda V^H]_{jj}$. Then under optimum bit loading it suffices to restrict the search of $S_0$ to (3.17) and to seek to minimize under unitary V:

$$P_L = \sum_{k=1}^{r} \beta_k \Big( \prod_{j \in \mathcal{I}_k^*} p_j \Big)^{2/L} \qquad (3.19)$$

with $\mathcal{I}_k^*$ defining the optimal arrangement of the sequence of $p_k$.

Before proceeding, we need a few results from the theory of majorization [4].

Definition 3.1  Consider two sequences $x = \{x_i\}_{i=1}^{n}$ and $y = \{y_i\}_{i=1}^{n}$ with $x_i \ge x_{i+1}$ and $y_i \ge y_{i+1}$. Then we say that y majorizes x, denoted as $x \prec y$, if

$$\sum_{i=1}^{k} x_i \le \sum_{i=1}^{k} y_i$$

holds for $1 \le k \le n$, with equality at k = n. We say that y weakly supermajorizes x, denoted $x \prec^w y$, if $\sum_{i=j}^{n} x_i \ge \sum_{i=j}^{n} y_i$, $1 \le j \le n$.

Fact 1  If H is an n x n Hermitian matrix with diagonal elements $h = \{h_i\}_{i=1}^{n}$ and eigenvalues $\lambda = \{\lambda_i\}_{i=1}^{n}$, then $h \prec \lambda$.

Definition 3.2  A real valued function $\phi(z) = \phi(z_1, \ldots, z_n)$ defined on a set $\mathcal{A} \subset R^n$ is said to be Schur concave on $\mathcal{A}$ if

$$x \prec y \text{ on } \mathcal{A} \Rightarrow \phi(x) \ge \phi(y).$$

$\phi$ is strictly Schur concave on $\mathcal{A}$ if strict inequality $\phi(x) > \phi(y)$ holds when x is not a permutation of y. Further if $x \prec^w y$ then also $\phi(x) \ge \phi(y)$.

We will now state a theorem that results in a test for strict Schur concavity. We denote $\phi_{(k)}(z) = \partial \phi(z) / \partial z_k$.

Lemma 3.2  Let $\phi(z)$ be a scalar real valued function defined and continuous on $\mathcal{D} = \{(z_1, \ldots, z_n) : z_1 \ge \cdots \ge z_n\}$, and twice differentiable on the interior of $\mathcal{D}$. Then $\phi(z)$ is Schur concave on $\mathcal{D}$ if $\phi_{(k)}(z)$ is increasing in k.

The following Lemma provides an important property of the optimum arrangement of the subchannels.

Lemma 3.3  Consider for $L \ge 2$ and $\alpha, \beta, a_i, b_j > 0$,

$$f = \Big( \alpha \prod_{k=0}^{L-1} a_k \Big)^{2/L} + \Big( \beta \prod_{l=0}^{L-1} b_l \Big)^{2/L}$$

with $a_i \ge a_{i+1}$ and $b_i \ge b_{i+1}$. Suppose for some i, j

$$a_i > b_j \quad \text{and} \quad \frac{\partial f}{\partial a_i} > \frac{\partial f}{\partial b_j}. \qquad (3.20)$$

Then

$$g = \Big( \alpha b_j \prod_{k=0, k \ne i}^{L-1} a_k \Big)^{2/L} + \Big( \beta a_i \prod_{l=0, l \ne j}^{L-1} b_l \Big)^{2/L} < f.$$

Further $\partial f / \partial a_i \le \partial f / \partial a_{i+1}$ and $\partial f / \partial b_i \le \partial f / \partial b_{i+1}$.

Thus from Lemma 3.3, any optimum arrangement for (3.19) requires that for all $p_k \ge p_l$,

$$\frac{\partial P_L}{\partial p_k} \le \frac{\partial P_L}{\partial p_l}.$$

Under this condition, $P_L$ is Schur concave. Thus from Fact 1, as

$$\{ [V \Lambda V^H]_{ii} \} \prec \{ \lambda_0, \ldots, \lambda_{M-1} \},$$

the choice of V as a permutation matrix that enforces an optimum arrangement of subchannels minimizes (3.19). Thus to within a permutation matrix P, under optimum bit allocation, one can choose as an optimizing $S_0 = P \Lambda^{-1/2} U^H$. Now note that for any diagonal nonsingular matrix $\Omega$,

$$J(S_0) = J(\Omega S_0),$$

and that for some diagonal matrix $\tilde{\Lambda}$,

$$S_0 = P \Lambda^{-1/2} U^H = \tilde{\Lambda}^{-1/2} P U^H.$$

Thus as in [5], the minimizing $S_0 = P U^H$ with P enforcing the optimum arrangement. Under these conditions

$$[S_0^{-H} S_0^{-1}]_{jj} [S_0 R_w S_0^H]_{jj} = \sigma^2_{e_{j,k}}$$

and indeed the $\sigma^2_{e_{j,k}}$ are the eigenvalues of $R_w$.

Thus, regardless of A the best $S_0$ is a Karhunen-Loeve Transform of $R_w$, and the $\sigma^2_{e_{j,k}}$ equal the eigenvalues of $R_w$. From the comment on supermajorization made at the end of Definition 3.2, it follows that the optimizing A must be such that the set of resulting eigenvalues of $R_w$ weakly supermajorize all possible sets of attainable eigenvalues. The optimizing A can then be shown to be given by, [6],

$$A = -U_0^H R_v U_1 (U_1^H R_v U_1)^{-1}. \qquad (3.21)$$
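
To make the bit loading step of this section concrete, the following sketch applies rule (3.14) to one user, assuming the rule reads $b_{j,k} = N t_k / L - \log_2$ of each subchannel's noise-gain product divided by the geometric mean of those products over the user's subchannels. The function name `optimum_bit_loading` and the numbers are hypothetical, the common factor $d_k$ is omitted, and the rule as written does not force integer or nonnegative bit counts.

```python
import numpy as np

def optimum_bit_loading(q, N, t_k):
    """Bits per subchannel for one user.

    q   : positive products sigma_e^2 [G0^H G0]_jj for the L subchannels of the user
    N   : block length appearing in the rate constraint (2.7)
    t_k : required bit rate of the user
    """
    q = np.asarray(q, dtype=float)
    L = len(q)
    geo_mean = np.exp(np.mean(np.log(q)))  # (prod q_j)^(1/L)
    return N * t_k / L - np.log2(q / geo_mean)

q = np.array([4.0, 1.0, 0.25, 1.0])        # toy values with geometric mean 1
N, t_k = 5, 2.0
b = optimum_bit_loading(q, N, t_k)
power_terms = 2.0 ** b * q                 # per-subchannel terms of (2.12), without d_k

# Any other allocation with the same total bits should need more power.
b_other = b + np.array([0.5, -0.5, 0.0, 0.0])
power_opt = power_terms.sum()
power_other = np.sum(2.0 ** b_other * q)
print(b, power_opt, power_other)
```

For these values b = [0.5, 2.5, 4.5, 2.5], the bits sum to N·t_k = 10, and every subchannel contributes the same term 2^2.5, which is the equalization condition (3.13).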


4.  Simulation  results 

In this section, we compare the transmitting power of the DFT based DMT under no bit allocation and optimum bit allocation with an optimum unitary transceiver. We assume the equalized channel to be $C(z) = 1 + 0.5 z^{-1}$, and a noise source v(n) whose power spectral density is shown in fig. 2. We assume the DMT system supports three user services, where each user is allocated an equal number of subchannels and with the same SER requirement and modulation scheme for each user. The plot in fig. 2 compares the transmit power levels as the number of subchannels allotted to the users is varied. The plot shows that there is a 10 dB saving in transmit power with our design over the DFT based DMT under optimum bit allocation, and a 14 dB improvement over the conventional DMT with no optimum bit allocation.


Fig.  2.  Comparison  of  transmit  power  levels. 


5.  Conclusions 

In  this  paper,  an  optimum  bit  allocation  strategy  and  design 
of  a  general  biorthogonal  DMT  multicarrier  transceiver  sys¬ 
tem  employing  zero  padding  redundancy  were  presented, 
for  minimizing  the  transmit  power  when  different  users  with 
varied  QoS  requirements  are  supported  and  are  assigned  po¬ 
tentially  different  number  of  subchannels.  We  showed  that 
no  gains  in  transmit  power  can  be  obtained  by  consider¬ 
ing  biorthogonal  transceivers  over  orthogonal  transceivers. 
These  results  also  show  that  the  optimum  transceiver  de¬ 
pends  only  on  the  channel  and  interference  conditions  and 
not on the QoS requirements. Indeed to within a permutation of subchannels, the optimum transceiver obtained here is identical to that obtained in [5]. Equally, should the channel/interference remain invariant after the initial connection is established, then only bit loading and subchannel selection need be updated in response to changing traffic needs.

ABSTRACT

This  paper  considers  the  design  of  biorthogonal  DMT  multicarrier 
transceiver  systems  supporting  multiple  services.  The  supported 
user  services  may  have  differing  quality  of  service  (QoS)  require¬ 
ments,  quantified  in  this  paper  by  bit  rate  and  symbol  error  rate 
specifications.  To  reflect  their  service  priorities,  different  users  on 
the  system  can  be  potentially  assigned  different  number  of  sub¬ 
channels.  Our  goal  is  to  minimize  the  transmitted  power  given  the 
QoS  specifications  for  the  different  users,  subject  to  the  knowledge 
of  colored  interference  at  the  receiver  input  of  the  DMT  system.  In 
particular  we  find  an  optimum  bit  loading  scheme  that  distributes 
the  bit  rate  transmitted  across  the  various  subchannels  belonging 
to  the  different  users,  and  subject  to  this  bit  allocation,  determine 
an  optimum  transceiver. 

1.  INTRODUCTION 

Future  broadband  communication  systems  will  be  expected  to  de¬ 
liver  multiple  services,  such  as  voice,  data,  video,  with  multiple- 
stream  support.  Because  delivery  of  these  streams  will  be  under 
differing  requirements  such  as  information  rate  and  error  perfor¬ 
mance,  allocation  of  critical  resources  like  power  would  have  a 
significant  impact  on  the  overall  capacity  of  the  communication 
system.  Discrete  multitone  (DMT)  is  a  channel  coding  technique 
to  achieve  reliable,  high  data  rate  communications  in  such  systems. 
It  is  a  current  standard  in  various  wireline  applications  like  ADSL, 
VDSL,  [9],  and  in  the  form  of  Orthogonal  frequency  division  mul¬ 
tiplexing  (OFDM)  has  been  proposed  for  fixed  wireless  standards 
like WLAN's, [11]. This paper considers transceiver optimization
for  such  multicarrier  transmission  systems  operating  in  a  multiuser 
environment. 

More  specifically,  we  assume  that  a  single  DMT  system  sup¬ 
ports  r  users,  each  having  its  own  QoS  specification  quantified  by 
its bit rate and symbol error rate (SER). The $k$-th user is assumed
to have been assigned $n_k$ subchannels, and requires a bit rate of
$t_k$ and an SER of no more than $\eta_k$. The number of subchannels
assigned  to  each  user  is  fixed  a  priori  according  to  some  priori¬ 
ties  determined  by  the  user  service.  As  proposed  in  several  recent 
papers,  [4],  [6],  [8],  [10],  we  consider  general  DMT  transceivers 
which  are  more  general  than  the  traditional  DFT  based  systems  in 
that  the  input  and  output  transforms  are  general  block  transforms. 
We  consider  biorthogonal  systems  employing  zero  padding  redun¬ 
dancy  with  the  redundancy  removal  at  the  receiver  being  a  general 
linear operation. Our goal is to select transforms $G_0$, $S_0$ (see fig.
1), and the linear operation reflecting redundancy removal, and
assign bits/symbol to each subchannel, to achieve the QoS specifications,
under a zero intersymbol interference (ISI) condition
with  the  minimum  possible  transmitted  power.  We  assume  that  the 
channel  and  equalizer  are  known  and  so  is  the  interference  auto¬ 
correlation. 

We  thus  generalize  our  earlier  result  reported  in  [6],  where  the 
same  optimization  problem  was  considered  with  an  orthonormal 
transceiver  under  the  assumption  that  each  user  is  assigned  the 
same  number  of  subchannels.  We  shall  see  in  the  following  sec¬ 
tions  how  the  extension  to  [6]  considered  here  modifies  the  opti¬ 
mization  problem.  Further,  the  asymmetric  subchannel  assignment 
considered here is more realistic, as service priorities
may cause certain users to receive a greater number of subchannels
than  others.  For  example,  one  may  assign  more  subchannels  to 
video  services  than  to  audio  services. 

Figure  1  depicts  the  broad  contours  of  a  DMT  communications 
system.  An  incoming  data  stream  is  converted  into  M-parallel  data 
streams  of  lower  rate.  An  M-point  block  transformation  of  these 
streams  of  data  is  followed  by  a  parallel-to-serial  conversion,  prior 
to  transmission  through  the  communication  channel.  An  equalizer 
is  employed  to  shorten  the  dispersive  effects  of  the  transmission 
channel. The equalized channel $C(z)$ is assumed to be FIR of
length $\kappa$. For an FIR equalized channel of length $\kappa$, extra redundancy
of length $\kappa$ in the form of zero padding is added at the channel
input to infuse resistance to channel induced ISI. At the channel
output,  one  performs  in  succession  the  operations  of  redundancy 
removal,  serial-to-parallel  conversion,  and  the  application  of  an  in¬ 
verse  block  transform. 

Past  treatment  of  optimum  resource  allocation,  [1],  [2],  [3], 
has  been  restricted  mostly  to  bit  loading  and  power  allocation  al¬ 
gorithms.  Some  authors  have  studied  the  optimum  transceiver  de¬ 
sign  in  the  single  user  case,  [4],  [8],  [10].  While  [4]  was  concerned 
with  optimizing  the  transmitted  power,  [8]  focussed  on  the  max¬ 
imization  of  the  mutual  information  between  the  transmitted  and 
received  signals.  Essentially,  [4]  considers  the  same  optimization 
as  presented  here  but  under  the  assumption  that  only  a  single  user 
is  supported  by  the  system,  i.e.  r  =  1.  In  comparison  with  [4], 
the  multiuser  environment  considered  in  this  paper  renders  the  op¬ 
timization  problem  highly  non-trivial  as  shall  be  seen. 

In  Section  2,  we  give  a  description  of  the  DMT  system  and 
the  optimization  problem  under  consideration.  Section  3  presents 
our  results  on  optimum  transceiver  selection  and  the  optimum  bit 
rate  allocation  strategy  that  needs  to  be  adopted.  Section  4  presents 
some  simulation  results  showing  improvements  in  transmitted  power 
levels  obtained  with  our  optimum  design.  Section  5  concludes. 


Fig.  1.  DMT  communication  system. 


2.  MULTIUSER  DMT  SYSTEM  MODEL 
2.1.  Preliminaries 

In fig. 1, define $G_0$ and $S_0$ as the nonsingular $M \times M$ transmitter
and receiver transform matrices respectively. We assume a
biorthogonal system, i.e.

$$S_0 = G_0^{-1}. \tag{2.1}$$

Thus, the input block transformation is a general nonsingular transformation,
and the output transformation is its inverse. Denote the
collection of $M$ input and output symbols respectively by
$x(n) = [x_0(n), \ldots, x_{M-1}(n)]^T$ and $\hat{x}(n) = [\hat{x}_0(n), \ldots, \hat{x}_{M-1}(n)]^T$.
With $v(n)$ the noise and interference effect at the output of the
equalizer, denote $\mathbf{v}(n) = [v(Nn), v(Nn+1), \ldots, v(Nn+N-1)]^T$
as the $N$-fold blocked version of $v(n)$, with $N = M + \kappa$.
Then the system in fig. 1 has the equivalent description of fig. 2.
Here $\mathbf{C}(z)$ is the $N$-fold blocked version of the channel-equalizer
combination $C(z) = c_0 + c_1 z^{-1} + \ldots + c_\kappa z^{-\kappa}$. One can show
that

$$\mathbf{C}(z) = [\, C_L \;\; C_R(z) \,] \tag{2.2}$$

where $C_L$ is an $N \times M$ constant matrix, and $C_R(z)$ is an $N \times \kappa$
matrix.

The addition of zero padding redundancy to an $M$-block vector
is equivalent to premultiplication by

$$Z_{zp} = \begin{bmatrix} I_M \\ 0_{\kappa \times M} \end{bmatrix}. \tag{2.3}$$


Denote by $S_1$ a suitable $M \times N$ matrix, representing the linear
redundancy removal operation. Then the input-output relation of
the system in fig. 2 is given by

$$\hat{x}(n) = S_0 S_1 \mathbf{C}(z) Z_{zp} G_0 x(n) + S_0 S_1 \mathbf{v}(n) \tag{2.4}$$

$$\phantom{\hat{x}(n)} = S_0 S_1 C_L G_0 x(n) + S_0 S_1 \mathbf{v}(n). \tag{2.5}$$

We impose the perfect reconstruction (PR) condition, i.e., in
the absence of noise/interference, $\hat{x}(n) = x(n)$ for all $n$. In other
words,

$$S_0 S_1 C_L G_0 = I, \tag{2.6}$$

and the DMT system has no ISI. To obtain a more useful characterization
of PR, consider the singular value decomposition of $C_L$
defined in (2.2):

$$C_L = U_c \begin{bmatrix} \Lambda_c \\ 0 \end{bmatrix} V_c^H = U_0 \Lambda_c V_c^H \tag{2.7}$$


Fig.  2.  Block  representation  of  DMT  communications  system. 


where $U_c$ and $V_c$ are respectively $N \times N$ and $M \times M$ unitary matrices
whose columns are the eigenvectors of $C_L C_L^H$ and $C_L^H C_L$.
$\Lambda_c$ is the $M \times M$ real, positive definite diagonal matrix with diagonal
elements that are the singular values of $C_L$. Then, because of
(2.6), given $G_0$, the class of all $S_0 S_1$ enforcing PR is completely
characterized by (2.1) and

$$S_1 = V_c \Lambda_c^{-1} [\, I_M \;\; A \,] U_c^H, \tag{2.8}$$

where $A$ is any arbitrary $M \times \kappa$ matrix. In the sequel it will be
useful to partition $U_c$ as $U_c = [\, U_0 \;\; U_1 \,]$, where $U_0$ is $N \times M$ and
$U_1$ is $N \times \kappa$.

Note that, as $V_c$, $U_c$ and $\Lambda_c$ are supplied by the channel, the only
quantities that need to be found to determine the transceiver completely
are $G_0$ and $A$.
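
The PR characterization above can be checked numerically. The following is a minimal sketch only, assuming NumPy; the helper names `pr_receiver` and `pr_error`, the random stand-in for $C_L$ and the chosen dimensions are illustrative assumptions, not part of the paper.

```python
import numpy as np

def pr_receiver(C_L, A):
    """Build S1 = V_c Lambda_c^{-1} [I_M  A] U_c^H as in (2.8).

    C_L: N x M matrix with full column rank; A: arbitrary M x (N - M) matrix.
    """
    N, M = C_L.shape
    # Full SVD: C_L = U_c @ [diag(sing); 0] @ Vh, with U_c of size N x N.
    U_c, sing, Vh = np.linalg.svd(C_L, full_matrices=True)
    V_c = Vh.conj().T
    I_A = np.hstack([np.eye(M), A])          # the M x N block [I_M  A]
    return V_c @ np.diag(1.0 / sing) @ I_A @ U_c.conj().T

def pr_error(M=4, kappa=2, seed=0):
    """Largest entry of |S0 S1 C_L G0 - I| for randomly drawn C_L, G0 and A."""
    rng = np.random.default_rng(seed)
    N = M + kappa
    C_L = rng.standard_normal((N, M))        # hypothetical stand-in for C_L
    G0 = rng.standard_normal((M, M))         # generic nonsingular transmitter
    A = rng.standard_normal((M, kappa))      # free parameter in (2.8)
    S0 = np.linalg.inv(G0)                   # biorthogonality (2.1)
    S1 = pr_receiver(C_L, A)
    return float(np.max(np.abs(S0 @ S1 @ C_L @ G0 - np.eye(M))))

print(pr_error())   # a value near machine precision, whatever A is drawn
```

The residual stays at rounding level for any $A$, which illustrates why $A$ remains a free design parameter once PR is imposed.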

2.2.  Problem  formulation 

As mentioned earlier, the $M$ subchannels are distributed among
the $r$ users, with the $k$-th user allocated $n_k$ subchannels. Thus consider
disjoint subsets $\mathcal{I}_k \subset \{0, \ldots, M-1\}$ with $|\mathcal{I}_k| = n_k$, and
$\mathcal{I}_k \cap \mathcal{I}_j = \emptyset$, $k \neq j$. Subchannel assignment to the $k$-th user
constitutes determining $\mathcal{I}_k$. We assume that the $j$-th subchannel of
the $k$-th user is assigned $b_{j,k}$ bits per symbol. To meet the bit rate
specification for the $k$-th user one requires that

$$\frac{1}{N} \sum_{j \in \mathcal{I}_k} b_{j,k} = t_k. \tag{2.9}$$


Let the input power in the $j$-th subchannel of the $k$-th user be
$\sigma^2_{x_{j,k}}$. Assume that $\sigma^2_{e_{j,k}}$ is the noise power in this subchannel.
Under high SNR most modulation schemes, [7], require that to
achieve a given SER the required SNR is proportional to $2^{b_{j,k}}$.
More precisely,

$$\sigma^2_{x_{j,k}} = d_k 2^{b_{j,k}} \sigma^2_{e_{j,k}}, \tag{2.10}$$

where the constant $d_k$ depends on the desired SER for the $k$-th
user, $\eta_k$. For example, for QAM, $d_k = \frac{1}{3}[Q^{-1}(\eta_k/4)]^2$. Under
this framework, the transmitted power for the biorthogonal DMT
system is given by

$$P_b = \sum_{k=1}^{r} \sum_{j \in \mathcal{I}_k} \sigma^2_{x_{j,k}} [G_0^H G_0]_{jj} \tag{2.11}$$

$$\phantom{P_b} = \sum_{k=1}^{r} \sum_{j \in \mathcal{I}_k} d_k 2^{b_{j,k}} \sigma^2_{e_{j,k}} [G_0^H G_0]_{jj}. \tag{2.12}$$


The minimization of the transmission power involves optimal
selection of $b_{j,k}$ (bit loading), selection of $\mathcal{I}_k$ (subchannel assignment),
$G_0$ (transformation selection) and $A$ (redundancy removal
selection).


3.  OPTIMUM  SELECTIONS 

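With $R_e$, $R_w$ and $R_v$ denoting the autocorrelation matrices of the
noise vectors $e(n)$, $w(n)$ and $\mathbf{v}(n)$ respectively, where $e(n)$ and $w(n)$
are the noise vectors at the outputs of $S_0$ and $S_1$ in fig. 2,

$$R_e = S_0 R_w S_0^H \quad \text{and} \quad R_w = S_1 R_v S_1^H. \tag{3.13}$$

Note that $\sigma^2_{e_{j,k}}$ in (2.12) are the diagonal elements of $R_e$. Thus,
using $G_0 = S_0^{-1}$ from (2.1), (2.12) can be rewritten as

$$P_b(S_0) = \sum_{k=1}^{r} \sum_{j \in \mathcal{I}_k} d_k 2^{b_{j,k}} [S_0 R_w S_0^H]_{jj} [S_0^{-H} S_0^{-1}]_{jj}. \tag{3.14}$$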

First consider the problem of determining the optimum $G_0$, or
equivalently choosing $S_0$. Observe that the problem of choosing
$S_0$ under (2.1) to minimize (3.14) has the following form:

Problem 3.1 With $M \times M$ positive definite Hermitian $R_w$,
$M \times M$ nonsingular $S_0$, and $\alpha_i > 0$, determine $S_0$ to minimize

$$J(S_0) = \sum_{i=0}^{M-1} \alpha_i [S_0 R_w S_0^H]_{ii} [S_0^{-H} S_0^{-1}]_{ii}. \tag{3.15}$$

Now note that for any diagonal nonsingular matrix $\Omega$, $J(S_0) =
J(\Omega S_0)$. This means that in finding a minimizing $S_0$ for (3.15),
one can restrict the search to an $S_0$ for which the diagonal elements
of $S_0^{-H} S_0^{-1}$ are $1/\alpha_i$. The following Lemma considers the
equivalent constrained optimization of Problem 3.1.
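
The scaling invariance and the resulting normalization can be illustrated numerically. This is a sketch under stated assumptions: NumPy is assumed, the function names `J` and `normalize_rows` are hypothetical, and the matrices are random test data rather than anything taken from the paper.

```python
import numpy as np

def J(S0, Rw, alpha):
    """Objective (3.15): sum_i alpha_i [S0 Rw S0^H]_ii [S0^{-H} S0^{-1}]_ii."""
    S0_inv = np.linalg.inv(S0)
    noise = np.real(np.diag(S0 @ Rw @ S0.conj().T))
    gain = np.real(np.diag(S0_inv.conj().T @ S0_inv))
    return float(np.sum(alpha * noise * gain))

def normalize_rows(S0, alpha):
    """Scale the rows of S0 so that [S0^{-H} S0^{-1}]_ii = 1/alpha_i."""
    S0_inv = np.linalg.inv(S0)
    gain = np.real(np.diag(S0_inv.conj().T @ S0_inv))
    return np.diag(np.sqrt(alpha * gain)) @ S0

rng = np.random.default_rng(1)
M = 5
B = rng.standard_normal((M, M))
Rw = B @ B.T + np.eye(M)                     # positive definite test matrix
S0 = rng.standard_normal((M, M))             # generic nonsingular S0
alpha = rng.uniform(0.5, 2.0, M)
Omega = np.diag(rng.uniform(0.5, 3.0, M))    # arbitrary nonsingular diagonal

# After normalization the weighted objective reduces to a plain trace.
S0_n = normalize_rows(S0, alpha)
trace_form = float(np.trace(S0_n @ Rw @ S0_n.T))
print(J(S0, Rw, alpha), J(Omega @ S0, Rw, alpha), trace_form)
```

All three printed values agree up to rounding, which is the reduction from (3.15) to a trace objective under a diagonal constraint.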

Lemma 3.1 With $M \times M$ positive definite Hermitian matrix $R_w$,
$M \times M$ nonsingular $S_0$, consider the minimization of

$$\sum_{i=0}^{M-1} [S_0 R_w S_0^H]_{ii} \tag{3.16}$$

such that for all $i \in \{0, \ldots, M-1\}$ and some $\alpha_i > 0$,

$$[S_0^{-H} S_0^{-1}]_{ii} = 1/\alpha_i. \tag{3.17}$$

Then the minimizing $S_0$ obeys for some real, positive definite, diagonal
$\Gamma$,

$$(S_0 R_w S_0^H)(S_0 S_0^H) = \Gamma. \tag{3.18}$$

In fact one minimizing $S_0$ for (3.15) obeys

$$(S_0 R_w S_0^H)(S_0 S_0^H) = I.$$

The following result shows a particular $S_0$ minimizing (3.15).

Lemma 3.2 Let the SVD of positive definite Hermitian $R_w$ be

$$R_w = U \Lambda^2 U^H \tag{3.19}$$

with $\Lambda$ real, diagonal and $U$ unitary. Then for some unitary $V$,
(3.15) is minimized by

$$S_0 = V \Lambda^{-1/2} U^H \tag{3.20}$$

and (3.15) becomes

$$J(S_0) = \sum_{i=0}^{M-1} \alpha_i \left( [V \Lambda V^H]_{ii} \right)^2.$$
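
The closed form in Lemma 3.2 can be verified by direct substitution. The sketch below assumes NumPy and random test data; it only confirms the algebraic identity for an $S_0$ of the stated form and the condition $(S_0 R_w S_0^H)(S_0 S_0^H) = I$, and does not attempt to prove minimality. The function name `J` is an illustrative assumption.

```python
import numpy as np

def J(S0, Rw, alpha):
    """Objective (3.15) evaluated directly from its definition."""
    S0_inv = np.linalg.inv(S0)
    noise = np.real(np.diag(S0 @ Rw @ S0.conj().T))
    gain = np.real(np.diag(S0_inv.conj().T @ S0_inv))
    return float(np.sum(alpha * noise * gain))

rng = np.random.default_rng(2)
M = 4
B = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
Rw = B @ B.conj().T + np.eye(M)              # positive definite Hermitian
eigvals, U = np.linalg.eigh(Rw)              # Rw = U diag(eigvals) U^H
Lam = np.sqrt(eigvals)                       # so that Rw = U Lam^2 U^H, as in (3.19)
# A random unitary V obtained from a QR factorization.
V, _ = np.linalg.qr(rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M)))
alpha = rng.uniform(0.5, 2.0, M)

S0 = V @ np.diag(Lam ** -0.5) @ U.conj().T   # the form (3.20)
VLV = V @ np.diag(Lam) @ V.conj().T
closed_form = float(np.sum(alpha * np.real(np.diag(VLV)) ** 2))
identity_gap = float(np.max(np.abs(
    (S0 @ Rw @ S0.conj().T) @ (S0 @ S0.conj().T) - np.eye(M))))
print(J(S0, Rw, alpha), closed_form, identity_gap)
```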


Before proceeding, we need a few results from the theory of
majorization, [5].

Definition 3.1 Consider two sequences $x = \{x_i\}_{i=1}^{n}$ and $y =
\{y_i\}_{i=1}^{n}$ with $x_i \geq x_{i+1}$ and $y_i \geq y_{i+1}$. Then we say that $y$
majorizes $x$, denoted as $x \prec y$, if $\sum_{i=1}^{k} x_i \leq \sum_{i=1}^{k} y_i$ holds
for $1 \leq k \leq n$, with equality at $k = n$. We say that $y$ weakly
supermajorizes $x$, denoted $x \prec^{w} y$, if $\sum_{i=j}^{n} x_i \geq \sum_{i=j}^{n} y_i$,
$1 \leq j \leq n$.

Fact 1 If $H$ is an $n \times n$ Hermitian matrix with diagonal elements
$h = \{h_i\}_{i=1}^{n}$ and eigenvalues $\lambda = \{\lambda_i\}_{i=1}^{n}$, then $h \prec \lambda$.
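
Definition 3.1 and Fact 1 can be made concrete with a short numerical check. This is a minimal sketch assuming NumPy; the helper names `majorizes` and `schur_check` and the tolerance are illustrative choices, not part of the paper.

```python
import numpy as np

def majorizes(y, x, tol=1e-9):
    """True if y majorizes x (x ≺ y) in the sense of Definition 3.1."""
    xs = np.sort(np.asarray(x, dtype=float))[::-1]   # decreasing order
    ys = np.sort(np.asarray(y, dtype=float))[::-1]
    partial_ok = np.all(np.cumsum(xs) <= np.cumsum(ys) + tol)
    total_ok = abs(xs.sum() - ys.sum()) <= tol       # equality at k = n
    return bool(partial_ok and total_ok)

def schur_check(n=6, seed=3):
    """Check Fact 1 on a random Hermitian matrix: diagonal ≺ eigenvalues."""
    rng = np.random.default_rng(seed)
    B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    H = B + B.conj().T                               # Hermitian by construction
    return majorizes(np.linalg.eigvalsh(H), np.real(np.diag(H)))

print(majorizes([3, 1], [2, 2]))   # [3, 1] majorizes the more even [2, 2]
print(schur_check())
```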

Definition 3.2 A real valued function $\phi(z) = \phi(z_1, \ldots, z_n)$ defined
on a set $\mathcal{A} \subset \mathbb{R}^n$ is said to be Schur concave on $\mathcal{A}$ if
$x \prec y$ on $\mathcal{A}$ $\Rightarrow$ $\phi(x) \geq \phi(y)$. $\phi$ is strictly Schur concave on
$\mathcal{A}$ if strict inequality $\phi(x) > \phi(y)$ holds when $x$ is not a permutation
of $y$. Further if $x \prec^{w} y$ then also $\phi(x) \geq \phi(y)$.

We will now state a theorem that results in a test for strict
Schur concavity. We denote $\phi_{(k)}(z) = \partial \phi(z) / \partial z_k$.

Lemma 3.3 Let $\phi(z)$ be a scalar real valued function defined and
continuous on $\mathcal{D} = \{(z_1, \ldots, z_n) : z_1 \geq \ldots \geq z_n\}$, and twice
differentiable on the interior of $\mathcal{D}$. Then $\phi(z)$ is Schur concave on
$\mathcal{D}$ iff $\phi_{(k)}(z)$ is increasing in $k$.

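We now return to the minimization of (3.14) with $R_w$ having
the form (3.19). Because of Lemma 3.2, with $S_0 = V \Lambda^{-1/2} U^H$,
for any choice of $\mathcal{I}_k$ and $b_{j,k}$, and subject to $V$ being unitary,

$$P_b \geq \min_{V V^H = I} \sum_{k=1}^{r} d_k \sum_{j \in \mathcal{I}_k} 2^{b_{j,k}} \left( [V \Lambda V^H]_{jj} \right)^2$$

$$\phantom{P_b} \geq \min_{V V^H = I} \sum_{k=1}^{r} d_k 2^{N t_k / n_k} \prod_{j \in \mathcal{I}_k} \left( [V \Lambda V^H]_{jj} \right)^{2/n_k}.$$

Denote $\alpha_k = d_k 2^{N t_k / n_k}$ and $a_j = [V \Lambda V^H]_{jj}$. Then

$$P_b \geq P_b^{*} = \min \sum_{k=1}^{r} \alpha_k \prod_{j \in \mathcal{I}_k} a_j^{2/n_k} \tag{3.22}$$

with $\mathcal{I}_k$ defining the optimal arrangement of the sequence of $a_k$.
The following Lemma characterizes such optimum arrangements.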

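Lemma 3.4 Consider for integers $p, q \geq 2$,

$$f = \Big( \alpha \prod_{k=0}^{p-1} a_k \Big)^{2/p} + \Big( \beta \prod_{l=0}^{q-1} b_l \Big)^{2/q}$$

with $\alpha, \beta, a_i, b_j > 0$. Suppose for some $i, j$

$$a_i > b_j \quad \text{and} \quad \frac{\partial f}{\partial a_i} > \frac{\partial f}{\partial b_j}. \tag{3.23}$$

Then $g = \big( \alpha\, b_j \prod_{k \neq i} a_k \big)^{2/p} + \big( \beta\, a_i \prod_{l \neq j} b_l \big)^{2/q} < f$.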
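
The interchange argument of Lemma 3.4 can be probed numerically. The sketch below assumes NumPy and draws random positive data; `f_value` and `swap_check` are hypothetical helper names, and a small relative tolerance is used only to absorb rounding. It counts the cases satisfying (3.23) and any case in which swapping $a_i$ and $b_j$ fails to reduce $f$.

```python
import numpy as np

def f_value(a, b, alpha, beta):
    """f = (alpha * prod(a))^(2/p) + (beta * prod(b))^(2/q)."""
    p, q = len(a), len(b)
    return (alpha * np.prod(a)) ** (2.0 / p) + (beta * np.prod(b)) ** (2.0 / q)

def swap_check(trials=2000, seed=0):
    """Return (number of cases meeting (3.23), number where the swap did not help)."""
    rng = np.random.default_rng(seed)
    checked = 0
    violations = 0
    for _ in range(trials):
        p, q = (int(v) for v in rng.integers(2, 6, size=2))
        a = rng.uniform(0.1, 5.0, p)
        b = rng.uniform(0.1, 5.0, q)
        alpha, beta = rng.uniform(0.1, 5.0, 2)
        i, j = int(rng.integers(p)), int(rng.integers(q))
        # Partial derivatives of f with respect to a_i and b_j.
        df_da = (2.0 / p) * (alpha * np.prod(a)) ** (2.0 / p) / a[i]
        df_db = (2.0 / q) * (beta * np.prod(b)) ** (2.0 / q) / b[j]
        if a[i] > b[j] and df_da > df_db:            # condition (3.23)
            a2, b2 = a.copy(), b.copy()
            a2[i], b2[j] = b[j], a[i]                # interchange a_i and b_j
            f = f_value(a, b, alpha, beta)
            g = f_value(a2, b2, alpha, beta)
            checked += 1
            if g > f * (1.0 + 1e-12):
                violations += 1
    return checked, violations

print(swap_check())
```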


(3.21) 
Hence if there exist pairs $a_i, b_j$ such that (3.23) holds, violating
the Schur concavity condition in Lemma 3.3, the value of $f$ can be
reduced by interchanging $a_i, b_j$. From Lemma 3.4, an optimum
arrangement for (3.22) requires that for all $a_k > a_l$,
$\frac{\partial P_b^{*}}{\partial a_k} \leq \frac{\partial P_b^{*}}{\partial a_l}$.
Under this condition, $P_b^{*}$ is Schur concave. Thus from Fact 1,
as $\{[V \Lambda V^H]_{jj}\}_{j=0}^{M-1} \prec \{\lambda_0, \ldots, \lambda_{M-1}\}$, the choice $V = I$
minimizes (3.22). Thus the optimizing $S_0 = \Lambda^{-1/2} U^H$, under
optimum bit allocation.

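The optimum bit loading is obtained by using the fact that the
arithmetic mean of the $n_k$ numbers $\{2^{b_{j,k}} \sigma^2_{e_{j,k}} [G_0^H G_0]_{jj}\}_{j \in \mathcal{I}_k}$
in (2.12) is no less than their geometric mean, with equality iff

$$b_{j,k} = \frac{N t_k}{n_k} - \log_2 \left[ \frac{\sigma^2_{e_{j,k}} [G_0^H G_0]_{jj}}{\big( \prod_{i \in \mathcal{I}_k} \sigma^2_{e_{i,k}} [G_0^H G_0]_{ii} \big)^{1/n_k}} \right].$$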
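
The arithmetic-geometric mean argument can be sketched for a single user. The code below assumes NumPy; `optimum_bits` and `user_power` are hypothetical helper names, the per-subchannel weights `c` (standing for $\sigma^2_{e_{j,k}} [G_0^H G_0]_{jj}$) are made-up numbers, and the common factor $d_k$ is omitted since it scales every allocation equally. The resulting bit values are real numbers; any rounding to integer constellations is outside this sketch.

```python
import numpy as np

def optimum_bits(c, N, t_k):
    """Bit loading that equalizes 2^{b_j} c_j subject to (1/N) sum_j b_j = t_k."""
    c = np.asarray(c, dtype=float)
    n_k = c.size
    geo_mean = np.exp(np.mean(np.log(c)))        # geometric mean of the weights
    return N * t_k / n_k - np.log2(c / geo_mean)

def user_power(bits, c):
    """Per-user power sum_j 2^{b_j} c_j (the factor d_k is left out)."""
    return float(np.sum(2.0 ** np.asarray(bits, dtype=float) * np.asarray(c, dtype=float)))

c = np.array([0.5, 1.0, 4.0])                    # illustrative weights only
N, t_k = 6, 2.0
b_opt = optimum_bits(c, N, t_k)
b_flat = np.full(c.size, N * t_k / c.size)       # same rate, equal bits everywhere
print(b_opt, user_power(b_opt, c), user_power(b_flat, c))
```

With these numbers the equalized allocation meets the same rate constraint as the flat one while requiring less power, in line with the equality condition of the inequality.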

Thus under the above optimum bit allocation, and because of (3.13)
and Fact 1, the optimum $G_0$ (from (2.1)) is a matrix of eigenvectors
of $R_w$, and $[G_0^H G_0]_{jj} [G_0^{-1} R_w G_0^{-H}]_{jj}$ are the eigenvalues of
$R_w$. Thus even under the relaxed conditions on $G_0$, the best power
is achieved with orthogonal $G_0$. Further, the optimizing $A$ must be
such that the set of resulting eigenvalues of $R_w$ weakly supermajorizes
all possible sets of attainable eigenvalues. The optimizing $A$
can then be shown to be given by, [6],

$$A = -U_0^H R_v U_1 (U_1^H R_v U_1)^{-1}. \tag{3.24}$$
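
One way to see the effect of (3.24) is to compare the resulting $R_w$ with the one obtained from an arbitrary $A$. The sketch below assumes NumPy and random real test matrices; the helper names are hypothetical. It checks that the difference of the two $R_w$ matrices is positive semidefinite, so each ordered eigenvalue under (3.24) is no larger than its counterpart for the arbitrary $A$.

```python
import numpy as np

def split_svd(C_L):
    """Return U0 (N x M), U1 (N x kappa), singular values and V_c of C_L."""
    N, M = C_L.shape
    U_c, sing, Vh = np.linalg.svd(C_L, full_matrices=True)
    return U_c[:, :M], U_c[:, M:], sing, Vh.conj().T

def noise_after_removal(C_L, Rv, A):
    """R_w = S1 Rv S1^H with S1 = V_c Lambda_c^{-1} [I_M  A] U_c^H from (2.8)."""
    U0, U1, sing, V_c = split_svd(C_L)
    S1 = V_c @ np.diag(1.0 / sing) @ (U0.conj().T + A @ U1.conj().T)
    return S1 @ Rv @ S1.conj().T

def optimum_A(C_L, Rv):
    """A = -U0^H Rv U1 (U1^H Rv U1)^{-1}, as in (3.24)."""
    U0, U1, _, _ = split_svd(C_L)
    return -U0.conj().T @ Rv @ U1 @ np.linalg.inv(U1.conj().T @ Rv @ U1)

rng = np.random.default_rng(4)
M, kappa = 4, 2
N = M + kappa
C_L = rng.standard_normal((N, M))                # hypothetical stand-in for C_L
B = rng.standard_normal((N, N))
Rv = B @ B.T + np.eye(N)                         # positive definite noise covariance

Rw_opt = noise_after_removal(C_L, Rv, optimum_A(C_L, Rv))
Rw_rand = noise_after_removal(C_L, Rv, rng.standard_normal((M, kappa)))
gap_min_eig = float(np.linalg.eigvalsh(Rw_rand - Rw_opt).min())
print(gap_min_eig)   # non-negative up to rounding
```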

4.  SIMULATION  RESULTS 

In  this  section,  we  compare  the  transmitting  power  of  the  DFT 
based  DMT  under  no  bit  allocation  and  optimum  bit  allocation 
with  an  optimum  unitary  transceiver.  We  assume  the  equalized 
channel to be $C(z) = 1 + 0.5z^{-1}$, and a noise source $v(n)$ whose
power spectral density is shown in fig. 3. We assume the DMT system
supports two user services. The label $(i, j)$ on the x-axis of the plot
indicates that users 1 and 2 were allocated $i$ and $j$ channels
respectively. The plot shows that there is a 10 dB saving in
transmit  power  with  our  design  over  the  DFT  based  DMT  under 
optimum  bit  allocation,  and  a  14  dB  improvement  over  the  con¬ 
ventional  DMT  with  no  optimum  bit  allocation. 

5.  CONCLUSIONS 

In  this  paper,  an  optimum  bit  allocation  strategy  and  design  of 
a  general  biorthogonal  DMT  multicarrier  transceiver  system  em¬ 
ploying  zero  padding  redundancy  were  presented,  for  minimizing 
the  transmit  power  when  different  users  with  varied  QoS  require¬ 
ments  are  supported  and  are  assigned  potentially  different  number 
of  subchannels.  We  showed  that  no  gains  in  transmit  power  can  be 
obtained  by  considering  biorthogonal  transceivers  -  the  optimum 
power  is  achieved  through  an  orthogonal  transceiver.  Simulations 
demonstrated  potential  improvements  in  performance  over  DFT 
based  DMT  systems. 
Abstract

This  paper  considers  the  design  of  biorthogonal  DMT  multicarrier  transceiver 
systems  supporting  multiple  services.  The  supported  user  services  may  have 
differing  quality  of  service  (QoS)  requirements,  quantified  in  this  paper  by  bit 
rate  and  symbol  error  rate  specifications.  To  reflect  their  service  priorities, 
different  users  on  the  system  can  be  potentially  assigned  different  number  of 
subchannels.  Our  goal  is  to  minimize  the  transmitted  power  given  the  QoS 
specifications  for  the  different  users,  subject  to  the  knowledge  of  colored  in¬ 
terference  at  the  receiver  input  of  the  DMT  system.  In  particular  we  find  an 
optimum  bit  loading  scheme  that  distributes  the  bit  rate  transmitted  across 
the  various  subchannels  belonging  to  the  different  users,  and  subject  to  this  bit 
allocation,  determine  an  optimum  transceiver. 
1 Introduction


Discrete  multi-tone  (DMT)  modulation  has  proved  to  be  an  effective  solution  to  the 
problem of reliable and efficient data transmission over frequency selective communication
channels. It is a current standard in various wireline applications like ADSL,
VDSL,  [2],  and  in  the  form  of  Orthogonal  frequency  division  multiplexing  (OFDM) 
has  been  proposed  for  fixed  wireless  standards  like  IEEE  802.11a.  With  growing 
and  changing  user  needs,  such  communication  systems  are  expected  to  deliver  mul¬ 
tiple  services,  such  as  voice,  data  and  video,  with  multiple  stream  support.  Because 
delivery  of  these  streams  can  be  under  different  requirements  on  parameters  like  er¬ 
ror  performance,  appropriate  allocation  of  bandwidth  and  rates  among  the  various 
services  becomes  an  important  problem.  The  subject  of  this  paper  is  to  consider 
transceiver  optimization  of  general  biorthogonal  DMT  systems  operating  in  such  a 
multi-user  environment. 

We  consider  multiple  flows  supported  on  a  general  biorthogonal  DMT  system.  Fig. 
1  depicts  the  general  DMT  communication  system  under  consideration.  This  system  is 
more  general  as  compared  to  the  conventional  DMT  systems  in  that  the  transmitter 
and receiver transforms are characterized by square matrices $G_0$ and $S_0$, and the
redundancy removal at the receiver is a general linear transformation $S_1$. In contrast,
conventional  DMT  systems  employ  an  Inverse  discrete  Fourier  transform  (IDFT)  and 
DFT  at  the  transmitter  and  receiver  ends  respectively,  and  the  redundancy  removal 
is  done  by  simply  discarding  certain  symbols.  Also,  we  consider  zero  padding  as  the 
form  of  redundancy  injection,  as  proposed  in  several  recent  papers,  [7],  [10],  [13].  We 
assume  that  this  DMT  system  supports  r  service  flows.  Each  flow  may  have  its  own 
quality  of  service  (QoS)  requirement  quantified  in  this  paper  by  its  bit  rate  and  symbol 
error  rate  (SER).  Further,  depending  on  their  respective  service  priorities,  each  flow 
is  assumed  to  have  been  a  priori  allotted  a  certain  number  of  subchannels.  Thus  for 
instance,  large  bandwidth  consuming  video  flows  may  receive  more  subchannels  than 
voice or data flows. We thus assume that the $k$-th flow is assigned $n_k$ subchannels,
requires a bit rate of $t_k$ and an SER of no more than $\eta_k$. We desire to minimize
the total transmitted power given these service-flow QoS specifications so that
intersymbol interference (ISI) free transmission occurs. The goal is to select the input and
output block transforms $G_0$ and $S_0$, the linear redundancy matrix $S_1$, the number
of  bits/symbol  assigned  to  each  subchannel,  and  the  subchannel  assignment  to  each 
service  flow,  in  order  to  achieve  the  QoS  specifications  under  a  zero  ISI  condition  with 
minimum  transmitted  power.  We  assume  knowledge  of  the  equalized  channel  and  the 
second-order  statistics  of  the  noise  at  the  receiver  input. 

We  thus  extend  in  this  paper  our  results  reported  in  [11],  where  the  same  optimiza¬ 
tion  was  considered  with  an  orthonormal  transceiver  under  the  assumption  that  each 
flow  is  assigned  the  same  number  of  subchannels.  The  unequal  subchannel  assignment 
considered  in  this  work  is  more  realistic  as  service  priorities  may  cause  certain  flows 
to be assigned a greater number of subchannels than others. We shall find that relaxing
the orthonormality condition, that is, considering a biorthogonal system, and allowing
unequal subchannel assignment make the underlying optimization methodology
non-trivially different from the considerations in [11].

Related  treatments  in  literature  are  [3],  [5],  [6],  [7],  [8],  [13],  [16],  [17].  A  great 
amount  of  work  has  been  done  in  dealing  with  the  problem  of  bit  loading  and  power 
allocation.  References  [3],  [17]  provide  algorithms  for  the  bit  loading  problem  while 
[5],  [6]  treat  the  power  allocation  problem  for  the  multi-user  and  single  user  cases  re¬ 
spectively.  The  problem  of  minimizing  the  overall  transmit  power  with  multiple  users 
on  an  OFDM  system  was  considered  in  [17],  where  the  minimization  was  done  by 
adaptively  assigning  subcarriers  to  the  various  users  along  with  adjusting  the  number 
of  bits  and  user  power  levels.  The  problem  of  transceiver  optimization  has  received 
attention  only  recently,  with  works  [7],  [8],  [13]  dealing  with  optimum  designs  when 
a  single  user  is  supported  on  the  DMT  system.  While  [13]  treated  the  problem  of 
maximization  of  mutual  information  between  transmitted  and  received  signals,  works 
[7]  and  [8]  considered  the  problem  of  minimizing  transmit  power.  In  particular,  [7] 
considered  the  design  of  an  optimum  orthonormal  DMT  system  that  achieves  mini¬ 
mum  transmit  power  under  certain  specified  error  probability  and  rate  requirement. 
This  work  was  extended  in  [8]  with  the  orthonormality  constraint  relaxed.  In  both 
[7] and [8], however, it was assumed that a single user is supported on the DMT system.

The  problem  we  consider  in  this  paper  differs  from  the  treatments  [7],  [8],  [17] 
in  the  following  ways.  As  in  these  works,  we  consider  the  problem  of  minimizing 
transmit  power.  However  in  contrast  to  [17],  which  restricts  its  treatment  to  the 
problem  of  resource  allocation  to  conventional  fixed  DFT  based  systems,  we  consider 
the  problem  of  optimum  transceiver  design  of  a  general  biorthogonal  system  as  well. 
Further  the  multiuser  environment  in  our  paper  makes  our  treatment  substantively 
different  from  the  single  user  case  considered  in  [7],  [8]. 

Our approach in this paper closely follows the formalisms developed in [8].
The  bit  loading  solution  of  [8]  can  be  considered  as  a  water-pouring  approach.  Sup¬ 
pose  a  set  of  subchannels  experience  a  high  level  of  attenuation.  Then  the  optimum  bit 
loading  scheme  assigns  fewer  bits/symbol  on  these  subchannels.  Further  to  avoid  too 
many  of  such  low  performing  subchannels,  the  subchannel  selection  process  (optimum 
transform  selection)  has  to  squeeze  out  those  frequency  bands  with  adverse  conditions 
as  best  as  possible,  specifically  by  forcing  channel  nulls  or  noise/interference  peaks 
to  occupy  as  few  subchannels  as  possible.  These  notions  are  developed  for  the  mul¬ 
tiuser  case  in  our  paper.  A  major  conclusion  of  our  work  will  be  to  show  that,  as 
in  the  single  user  case  of  [8],  even  in  the  multiuser  case  with  asymmetric  subchan¬ 
nel  assignments,  optimal  performance  is  achieved  through  orthonormal  transforms. 
Specifically,  we  show  that  the  optimum  receiver  transform  is  one  which  diagonalizes 
the  autocorrelation  matrix  of  a  certain  noise  vector  at  the  receiver  input,  and  the 
optimum  transmit  transform  is  then  the  inverse  of  this  optimum  receiver  transform 
matrix. 

The  remainder  of  the  paper  is  organized  as  follows.  In  Section  2,  the  general 
biorthogonal  DMT  system  is  described,  and  the  characterization  of  such  a  system 
is  developed  along  the  lines  of  [8].  The  precise  optimization  problem  of  minimizing 
transmit  power  is  formulated  in  Section  3.  The  optimal  selections  of  the  optimization 
variables  are  described  in  Section  4.  Section  5  provides  some  simulations  and  Section  6 
concludes.  The  Appendix  provides  some  useful  results  from  the  theory  of  majorization 
employed  to  solve  the  optimization  problem  at  hand. 
Figure 1: DMT communication system.

The  following  notation  will  be  followed  in  the  subsequent  Sections.  Discrete 
time  signals  will  be  denoted  by  x(n),y(n)  etc.,  with  n  being  the  time  index.  An 
$M \times M$ diagonal square matrix with diagonal elements $\alpha_1, \ldots, \alpha_M$ is written as
$\mathrm{diag}\{\alpha_1, \ldots, \alpha_M\}$. $[A]_{ij}$ denotes the element in the $i$-th row, $j$-th column of matrix
$A$. $A^T$, $A^H$ respectively denote the transpose and transpose-conjugate of matrix
$A$. An $M \times M$ identity matrix will be denoted by $I_M$, with the subscript usually
dropped  when  the  dimensions  are  clear  from  context. 

2  General  DMT  system 

In this Section, we describe the general biorthogonal DMT system supporting the various
user flows. Consider the biorthogonal general DMT system shown in fig. 1. In a
biorthogonal DMT system, the transmit and receive matrices $G_0$ and $S_0$ are arbitrary
nonsingular $M \times M$ matrices, in contrast to orthonormal systems, [7], where $G_0$, $S_0$ are
unitary, i.e. $G_0^H G_0 = I = S_0^H S_0$. In conventional DMT systems, $G_0$ is an IDFT and
$S_0$ is the DFT matrix. We assume that the data streams $x_0(n), x_1(n), \ldots, x_{M-1}(n)$
result from the $r$ service flows. The $M \times M$ block transformation $G_0$ is applied to the
$M$-symbol data stream $[x_0(n), x_1(n), \ldots, x_{M-1}(n)]^T$, and redundancy in the form of
padded zeros is introduced followed by parallel-to-serial conversion, prior to transmission.
Thus consider the $M$ symbol vector $y(n) = [y_0(n), y_1(n), \ldots, y_{M-1}(n)]^T$. Each
such block is converted into an $N$-block, with $N = M + \kappa$, by simply appending $\kappa$
zeros to it, obtaining the block

$$s(n) = [s_0(n), s_1(n), \ldots, s_{N-1}(n)]^T = [y_0(n), y_1(n), \ldots, y_{M-1}(n), 0, \ldots, 0]^T.$$

Thus with

$$Z_{zp} = \begin{bmatrix} I_M \\ 0_{\kappa \times M} \end{bmatrix}, \tag{2.1}$$

we have

$$s(n) = Z_{zp}\, y(n).$$

Most practical transmission channels are characterized by dispersion that spreads over
a very large number of samples. A time domain equalizer is employed to limit these
dispersive effects and contain most of the channel energy in a few samples. Thus the combination
of the transmission channel and equalizer, given by the equalized-channel $C(z)$, is
assumed to be FIR of length $\kappa$. For such a $\kappa$ length FIR equalized-channel, addition of
zero padding redundancy of length $\kappa$ infuses resistance to channel induced ISI. Denote
the multiple-input multiple-output system relating $\hat{s}(n) = [\hat{s}_0(n), \hat{s}_1(n), \ldots, \hat{s}_{N-1}(n)]^T$
to $s(n)$ by the $N \times N$ channel matrix $\mathbf{C}(z)$. This matrix is obtained from the coefficients
of the equalized-channel $C(z) = c_0 + c_1 z^{-1} + \ldots + c_\kappa z^{-\kappa}$, and is given by

$$\mathbf{C}(z) = \begin{bmatrix}
c_0 & z^{-1} c_{N-1} & z^{-1} c_{N-2} & \cdots & z^{-1} c_1 \\
c_1 & c_0 & z^{-1} c_{N-1} & \cdots & z^{-1} c_2 \\
\vdots & & \ddots & & \vdots \\
c_{N-1} & c_{N-2} & \cdots & c_1 & c_0
\end{bmatrix} \tag{2.2}$$

with $c_i = 0$, for $i > \kappa$. At the equalized-channel output, the redundancy is removed
using an $M \times N$ linear transform matrix $S_1$. The $M$-block output samples
$\hat{x}_0(n), \hat{x}_1(n), \ldots, \hat{x}_{M-1}(n)$ are then obtained by the application of the $M \times M$ transformation
$S_0$.
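
The structure of the blocked channel matrix, and the way zero padding removes its dependence on $z^{-1}$, can be illustrated with a small example. This is a sketch assuming NumPy; `blocked_channel` is a hypothetical helper, and the channel $1 + 0.5z^{-1}$ with $M = 4$ is chosen only for illustration. The matrix is stored as $\mathbf{C}(z) = C^{(0)} + z^{-1} C^{(1)}$.

```python
import numpy as np

def blocked_channel(c, N):
    """Split the N-fold blocked channel matrix as C(z) = C0 + z^{-1} C1.

    c: FIR coefficients [c_0, ..., c_kappa] with len(c) <= N.
    """
    c_full = np.zeros(N)
    c_full[:len(c)] = c                          # c_i = 0 for i > kappa
    C0 = np.zeros((N, N))
    C1 = np.zeros((N, N))
    for row in range(N):
        for col in range(N):
            if row >= col:
                C0[row, col] = c_full[row - col]         # current-block part
            else:
                C1[row, col] = c_full[N + row - col]     # spill-over from previous block
    return C0, C1

M, kappa = 4, 1
N = M + kappa
C0, C1 = blocked_channel([1.0, 0.5], N)
Z_zp = np.vstack([np.eye(M), np.zeros((kappa, M))])      # zero padding matrix
C_L = C0 @ Z_zp                                          # constant N x M matrix
print(np.allclose(C1 @ Z_zp, 0.0))                       # the z^{-1} part vanishes
print(C_L)
```

Because the delayed part $C^{(1)}$ only has nonzero entries in its last $\kappa$ columns, multiplying by the zero padding matrix leaves a constant matrix, which is the matrix $C_L$ used in the sequel.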

Figure 2: Block model of DMT system.

Denote the blocks of $M$ input and output symbols by $x(n) = [x_0(n), \ldots, x_{M-1}(n)]^T$
and $\hat{x}(n) = [\hat{x}_0(n), \ldots, \hat{x}_{M-1}(n)]^T$ respectively. With $v(n)$ the noise and interference
effect at the output of the equalizer, denote $\mathbf{v}(n) = [v(Nn), v(Nn+1), \ldots, v(Nn+N-1)]^T$
as the $N$-fold blocked version of $v(n)$. We thus can redraw fig. 1 in the
equivalent form shown in fig. 2. It is easy to see that $\mathbf{C}(z) Z_{zp} = C_L$, an $N \times M$
constant matrix. The output symbol vector $\hat{x}(n)$ is then given by

$$\hat{x}(n) = S_0 S_1 C_L G_0 x(n) + S_0 S_1 \mathbf{v}(n). \tag{2.3}$$

We impose the perfect reconstruction (PR) condition, i.e., in the absence of
noise/interference, $\hat{x}(n) = x(n)$ for all $n$. In other words,

$$S_0 S_1 C_L G_0 = I, \tag{2.4}$$

and the DMT system has no ISI. To obtain a more useful characterization of PR,
consider the singular value decomposition of $C_L$

$$C_L = U_c \begin{bmatrix} \Lambda_c \\ 0 \end{bmatrix} V_c^H = U_0 \Lambda_c V_c^H \tag{2.5}$$

where $U_c$ and $V_c$ are respectively $N \times N$ and $M \times M$ unitary matrices, $\Lambda_c$ is an $M \times M$
real, positive definite diagonal matrix, and $U_c$ is partitioned as $U_c = [\, U_0 \;\; U_1 \,]$, where
$U_0$ is $N \times M$ and $U_1$ is $N \times \kappa$. Then, because of (2.4), given $G_0$, the class of all $S_0 S_1$
enforcing PR is completely characterized by

$$S_0 = G_0^{-1} \tag{2.6}$$

and

$$S_1 = V_c \Lambda_c^{-1} [\, I_M \;\; A \,] U_c^H, \tag{2.7}$$
where $A$ is any arbitrary $M \times \kappa$ matrix. Note that as $V_c$, $U_c$ and $\Lambda_c$ are supplied by
the channel, the only quantities that need to be found to determine the transceiver
completely are $G_0$ and $A$.


3  Problem  formulation 


As mentioned earlier, the $M$ subchannels are distributed among the $r$ users with the
$k$-th user allocated $n_k$ subchannels. Assume that the $n_k$ indices of the subchannels
assigned to the $k$-th user are contained in the set $\mathcal{I}_k$. Thus consider disjoint subsets
$\mathcal{I}_k \subset \{0, \ldots, M-1\}$ with $|\mathcal{I}_k| = n_k \geq 1$, and $\mathcal{I}_k \cap \mathcal{I}_j = \emptyset$, $k \neq j$. Subchannel
assignment to the $k$-th user then constitutes determining $\mathcal{I}_k$. We assume that the
$j$-th subchannel of the $k$-th user is assigned $b_{j,k}$ bits per symbol. To meet the bit rate
specification for the $k$-th user one requires that, for each $1 \leq k \leq r$,

$$\sum_{j \in \mathcal{I}_k} b_{j,k} = t_k. \tag{3.8}$$

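We assume that each service flow employs a different modulation scheme and has
to meet a certain SER. Most $b$-bit symbol constellation schemes require an output
signal-to-noise ratio (SNR) of $d\,2^{\zeta b}$, $b$ large, in order to achieve a given SER, say
$\eta$. Here $d > 0$ is determined by the SER $\eta$ and the employed modulation scheme, and the
constant $\zeta > 0$ depends on the particular modulation scheme used. For example, for
a $b$-bit square QAM, the SER is given by

$$\eta = 4\left(1 - 2^{-b/2}\right) Q\!\left(\sqrt{\frac{3\,\mathrm{SNR}}{2^b - 1}}\right) \approx 4\,Q\!\left(\sqrt{\frac{3\,\mathrm{SNR}}{2^b}}\right), \quad \text{when } 2^b \gg 1, \tag{3.9}$$

where

$$Q(a) = \int_a^\infty \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\, dx.$$

Thus for large $b$, $\mathrm{SNR} = d\,2^{\zeta b}$ with $\zeta = 1$, $d = \frac{1}{3}\left[Q^{-1}\!\left(\frac{\eta}{4}\right)\right]^2 > 0$. In the case of PAM,
$\zeta = 2$, $d = \frac{1}{3}\left[Q^{-1}\!\left(\frac{\eta}{2}\right)\right]^2 > 0$.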
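As an illustration of the high-rate model $\mathrm{SNR} = d\,2^{\zeta b}$, the following minimal sketch evaluates $\zeta$ and $d$ for square QAM and PAM from a target SER. The function names and the SER value used in the usage line are hypothetical; $Q^{-1}$ is obtained from the standard normal inverse CDF in the Python standard library.

```python
from math import log10
from statistics import NormalDist


def q_inverse(p):
    """Inverse of the Gaussian tail function Q(a) = P(X > a), X ~ N(0, 1)."""
    return NormalDist().inv_cdf(1.0 - p)


def snr_constants(ser, modulation):
    """Return (zeta, d) of the large-b model SNR = d * 2**(zeta * b)."""
    if modulation == "qam":
        return 1, q_inverse(ser / 4.0) ** 2 / 3.0
    if modulation == "pam":
        return 2, q_inverse(ser / 2.0) ** 2 / 3.0
    raise ValueError("unsupported modulation: " + modulation)


def required_snr_db(ser, modulation, bits):
    """Output SNR in dB needed to carry `bits` bits per symbol at the given SER."""
    zeta, d = snr_constants(ser, modulation)
    return 10.0 * log10(d * 2.0 ** (zeta * bits))


# Hypothetical target: SER of 1e-5 with 6-bit square QAM.
print(round(required_snr_db(1e-5, "qam", 6), 2))
```

Under this model each additional bit costs about 3 dB of SNR for QAM and about 6 dB for PAM, which is what the exponent $\zeta$ encodes.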

Let the input power in the $j$-th subchannel of the $k$-th user be $\sigma^2_{j,k}$, and $\sigma^2_{e_j}$ be
the output noise variance in this subchannel. Since under perfect reconstruction the
input power equals the output power in that subchannel, the input signal power and
the output noise power are related by

$$\frac{\sigma^2_{j,k}}{\sigma^2_{e_j}} = d_k\, 2^{\zeta_k b_{j,k}}, \tag{3.10}$$

where the constant $d_k$ is determined by the modulation scheme used and the desired
SER, $\eta_k$, for the $k$-th user, and $\zeta_k$ depends on the modulation scheme employed by
the $k$-th user. Here we have assumed that each of the subchannels of a particular
user has the same error rate. The error rates can however vary across different users.

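The total transmitted power of the DMT system can be written as

$$P_B = \sum_{i=0}^{N-1} \sigma^2_{s_i},$$

where $\sigma^2_{s_i}$ is the variance of the stream $s_i(n)$ in fig. 1. Rewriting in terms of $G_0$ and
$\sigma^2_{j,k}$, we have

$$P_B = \sum_{k=1}^{r} \sum_{j \in \mathcal{I}_k} \sigma^2_{j,k} \left[G_0^H G_0\right]_{jj}. \tag{3.11}$$

Denote $R_v$ to be the known autocorrelation matrix of the noise vector $v(n)$, and let
$e(n)$, $w(n)$ be the noise vectors at the output of $S_0$, $S_1$ respectively in fig. 2, with
respective autocorrelation matrices $R_e$, $R_w$. We then have the relations

$$R_e = S_0 R_w S_0^H \quad \text{and} \quad R_w = S_1 R_v S_1^H. \tag{3.12}$$

Note that $\sigma^2_{e_j}$ are the diagonal elements of $R_e$. Thus, because of (2.6), (3.10) and
(3.12), expression (3.11) can be rewritten as

$$P_B = \sum_{k=1}^{r} \sum_{j \in \mathcal{I}_k} d_k\, 2^{\zeta_k b_{j,k}} \left[S_0^{-H} S_0^{-1}\right]_{jj} \left[S_0 R_w S_0^H\right]_{jj}. \tag{3.13}$$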

Thus  the  precise  optimization  problem  can  be  stated  as  follows. 


Problem 3.1 Given positive $n_k$, $b_k$, and $M \times M$ positive definite Hermitian $R_v$, minimize
(3.13) by selecting $b_{j,k}$ (bit loading) subject to (3.8), selecting $\mathcal{I}_k$ (subchannel
assignment), $M \times M$ nonsingular $S_0$ (transformation selection) and $M \times k$ matrix $A$
(redundancy removal selection, under (2.7)).

We shall adopt a multi-step strategy to solve the above optimization problem.
First, for a given choice of $S_0$, $A$ and $\mathcal{I}_k$, we shall minimize $P_B$ by determining the
optimum $b_{j,k}$ under the rate constraint (3.8). This, as will be shown in Section 4.1, will
result in a lower bound for $P_B$, denoted $P_{B\mathrm{opt}}$, attained under optimal selection of $b_{j,k}$. The
expression for $P_{B\mathrm{opt}}$ will itself be shown to be independent of $b_{j,k}$, and the strategy will
then be to minimize $P_{B\mathrm{opt}}$ through selection of the remaining optimization variables.
In the next step, assuming fixed $A$, which in turn fixes $R_w$, we shall minimize $P_{B\mathrm{opt}}$
by the selection of $S_0$ and $\mathcal{I}_k$. It will be shown in Section 4.2 that regardless of the
choice of $A$, $P_{B\mathrm{opt}}$ reduces to an expression that is determined by the eigenvalues of
$R_w$. These eigenvalues are determined exclusively by $A$ and are independent of the
choice of $S_0$. The final step in our optimization solution process will hence be to find
an $A$ that renders these eigenvalues most favorable.


4  Optimum  selections 

In  this  section  we  consider  the  optimum  selection  of  the  various  variables.  First 
consider  the  problem  of  bit  loading. 


4.1  Optimum  Bit  Loading 


As discussed in the previous section, we separate the optimization problem into different
parts. First we ask: for a given choice of transceiver, i.e. given $S_0$, $S_1$, and
a certain choice of $\mathcal{I}_k$, what is the optimum allocation of $b_{j,k}$ so that (3.13) is
minimized under the constraint (3.8)? This is a constrained optimization problem and
can be solved using the Lagrange multiplier method. In fact, using the Arithmetic
Mean-Geometric Mean (AM-GM) inequality, which states that the arithmetic mean
of a set of positive numbers is greater than or equal to its geometric mean, with equality
if and only if all the numbers are equal, we have that for a given choice of $\mathcal{I}_k$ and $S_0$, $S_1$,
under (3.8),

$$P_B \ge P_{B\mathrm{opt}} = \sum_{k=1}^{r} d_k n_k\, 2^{\zeta_k b_k / n_k} \left( \prod_{j \in \mathcal{I}_k} \left[S_0^{-H} S_0^{-1}\right]_{jj} \left[S_0 R_w S_0^H\right]_{jj} \right)^{1/n_k} \tag{4.14}$$

with equality iff for all $k$ and $i, j \in \mathcal{I}_k$,

$$2^{\zeta_k b_{i,k}} \left[S_0^{-H} S_0^{-1}\right]_{ii} \left[S_0 R_w S_0^H\right]_{ii} = 2^{\zeta_k b_{j,k}} \left[S_0^{-H} S_0^{-1}\right]_{jj} \left[S_0 R_w S_0^H\right]_{jj}. \tag{4.15}$$

This is in turn equivalent to the optimum bit loading rule:

$$b_{j,k} = \frac{b_k}{n_k} - \frac{1}{\zeta_k} \log_2 \frac{\left[S_0^{-H} S_0^{-1}\right]_{jj} \left[S_0 R_w S_0^H\right]_{jj}}{\left( \prod_{i \in \mathcal{I}_k} \left[S_0^{-H} S_0^{-1}\right]_{ii} \left[S_0 R_w S_0^H\right]_{ii} \right)^{1/n_k}}. \tag{4.16}$$
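The bit loading rule can be checked numerically for a single user. In the sketch below, `c[j]` stands for the product $[S_0^{-H}S_0^{-1}]_{jj}[S_0R_wS_0^H]_{jj}$ on the $j$-th subchannel of that user, `b_total` for the user's bit budget in (3.8), and `d`, `zeta` for the user's modulation constants; the numerical values are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
from math import log2, prod


def optimal_bit_loading(c, b_total, zeta):
    """Bits per subchannel from rule (4.16) for one user (c[j] > 0)."""
    n = len(c)
    geo_mean = prod(c) ** (1.0 / n)
    return [b_total / n - log2(cj / geo_mean) / zeta for cj in c]


def user_power(c, bits, d, zeta):
    """One user's share of (3.13): d * sum_j 2**(zeta * b_j) * c_j."""
    return d * sum(2.0 ** (zeta * bj) * cj for bj, cj in zip(bits, c))


def power_lower_bound(c, b_total, d, zeta):
    """One user's term of the AM-GM lower bound (4.14)."""
    n = len(c)
    return d * n * 2.0 ** (zeta * b_total / n) * prod(c) ** (1.0 / n)


c = [4.0, 1.0, 0.25]                      # hypothetical per-subchannel products
bits = optimal_bit_loading(c, b_total=12.0, zeta=1)
print(bits)                               # noisier subchannels receive fewer bits
print(user_power(c, bits, d=1.0, zeta=1), power_lower_bound(c, 12.0, 1.0, 1))
print(user_power(c, [4.0, 4.0, 4.0], d=1.0, zeta=1))  # uniform loading costs more
```

With these values the optimum loading meets the bound (4.14) exactly, while a uniform split of the same 12 bits requires more power. The rule does not by itself guarantee integer or nonnegative bit counts; any rounding step would be a separate design choice not covered here.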


The remaining variables $S_0$, $A$, $\mathcal{I}_k$ must now be selected to minimize $P_{B\mathrm{opt}}$. Note
that while the choice of these other variables impacts the selection of $b_{j,k}$, $P_{B\mathrm{opt}}$ itself
is independent of $b_{j,k}$. This underscores the fact that the remaining variables can be
selected regardless of the precise values of $b_{j,k}$ obtained through (4.16).

Note that $P_{B\mathrm{opt}}$ is much more complicated than its specializations, $r = 1$, studied
in [7], and $n_k = L$ for all $k$ along with $S_0^H S_0 = I$, studied in [11].


4.2 Selection of $S_0$, $\mathcal{I}_k$ and $A$

In this section, we address the problem of designing the optimum transceiver and
optimal subchannel assignment. Specifically, we have to find the optimum $S_0$, $\mathcal{I}_k$ and
$A$ minimizing $P_{B\mathrm{opt}}$ in (4.14).

Assume for the moment that $A$, and hence $S_1$, has been selected. Note that this
then fixes the autocorrelation matrix $R_w$ in (3.12). For convenience we first consider
the minimization of $P_B$ in (3.13). Our goal will now be to determine the search
space for $S_0$, i.e. determine the class of $S_0$ minimizing $P_B(S_0)$, for a given matrix $A$.
Consider then the following optimization problem.

Problem 4.1 With $M \times M$ positive definite Hermitian $R$, $M \times M$ nonsingular
$S$, and $\alpha_i > 0$, determine $S$ to minimize

$$J(S) = \sum_{i=1}^{M} \alpha_i \left(e_i^T S R S^H e_i\right) \left(e_i^T S^{-H} S^{-1} e_i\right) \tag{4.17}$$

where $e_i$ is the $M \times 1$ column vector with all elements zero except for the $i$-th element,
which is unity.

Observe that (3.13) has this form. Let the positive definite Hermitian $R$ have the
SVD

$$R = U \Lambda^2 U^H \tag{4.18}$$

with $\Lambda$ real, diagonal and $U$ unitary. It is noteworthy that in all the papers [1], [7],
[8], the choice $S = P U^H$, with $P$ a permutation matrix, minimizes $P_{B\mathrm{opt}}$. If $S$ is
restricted to be unitary, then [1] shows that this choice of $S$ also minimizes (4.17).
Consider, however, the example where $\alpha_i = 1$, $M = 2$ and $R_w = \mathrm{diag}\{9, 1\}$. Then
observe that $J(I) = 10$ but with

$$B = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} 1/\sqrt{3} & 0 \\ 0 & 1 \end{bmatrix},$$

$J(B) = 8 < J(I)$. Thus in general $S = U^H$ does not minimize (4.17). However, we
will show in the following results that it does minimize $P_{B\mathrm{opt}}$.

The following Lemma shows that $J$ is invariant under any diagonal scaling of $S$.

Lemma 4.1 With $J(S)$ as above,

$$J(S) = J(\Theta S)$$

for any diagonal nonsingular $\Theta$.

Proof: Proved by direct verification. ■
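Both the two-dimensional example and the scaling invariance of Lemma 4.1 are easy to confirm numerically. The sketch below is restricted to real $2 \times 2$ matrices (so that $S^H$ is the transpose) and takes $R = \mathrm{diag}\{9, 1\}$, $\alpha_i = 1$, and $B$ equal to the normalized $2 \times 2$ Hadamard matrix times $\mathrm{diag}\{1/\sqrt{3}, 1\}$; the helper names and the scaling matrix `THETA` are hypothetical.

```python
from math import sqrt


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def inverse(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


def cost_j(s, r, alpha=(1.0, 1.0)):
    """J(S) of (4.17) for real 2x2 S."""
    srs = matmul(matmul(s, r), transpose(s))      # S R S^H
    s_inv = inverse(s)
    gram = matmul(transpose(s_inv), s_inv)        # S^{-H} S^{-1}
    return sum(alpha[i] * srs[i][i] * gram[i][i] for i in range(2))


R = [[9.0, 0.0], [0.0, 1.0]]
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
HADAMARD = [[1 / sqrt(2), 1 / sqrt(2)], [1 / sqrt(2), -1 / sqrt(2)]]
B = matmul(HADAMARD, [[1 / sqrt(3), 0.0], [0.0, 1.0]])
THETA = [[5.0, 0.0], [0.0, -0.2]]                 # arbitrary nonsingular diagonal

print(cost_j(IDENTITY, R))            # close to 10
print(cost_j(B, R))                   # close to 8
print(cost_j(matmul(THETA, B), R))    # unchanged by the diagonal scaling
```

The invariance holds because left multiplication by $\Theta$ scales the $i$-th diagonal entry of $SRS^H$ by $\theta_i^2$ and that of $S^{-H}S^{-1}$ by $1/\theta_i^2$.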

This in particular means that one can restrict attention to $S$ such that the diagonal
elements of $S^{-H} S^{-1}$ are $1/\alpha_i$. Lemma 4.2 shows an important property of this
equivalent constrained optimization problem.


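Lemma 4.2 With $M \times M$ positive definite Hermitian matrix $R$ and $M \times M$ nonsingular
$S$, consider the minimization of

$$\sum_{i=1}^{M} e_i^T S R S^H e_i \tag{4.19}$$

such that for all $i \in \{1, \ldots, M\}$ and some $\alpha_i > 0$,

$$e_i^T S^{-H} S^{-1} e_i = 1/\alpha_i. \tag{4.20}$$

Then the minimizing $S$ obeys, for some real diagonal $\Gamma$,

$$(S R S^H)(S S^H) = \Gamma. \tag{4.21}$$

Proof: Denote $[S]_{kl} = s_{kl}$, and let $s_{kl}^{(R)}$ and $s_{kl}^{(I)}$ be the real and imaginary parts of $s_{kl}$
respectively. Note that

$$\frac{\partial S}{\partial s_{kl}^{(R)}} = e_k e_l^T, \qquad \frac{\partial S}{\partial s_{kl}^{(I)}} = j\, e_k e_l^T,$$

$$\frac{\partial S^H}{\partial s_{kl}^{(R)}} = e_l e_k^T, \qquad \frac{\partial S^H}{\partial s_{kl}^{(I)}} = -j\, e_l e_k^T,$$

$$\frac{\partial S^{-1}}{\partial s_{kl}^{(R)}} = -S^{-1} e_k e_l^T S^{-1}, \qquad \frac{\partial S^{-1}}{\partial s_{kl}^{(I)}} = -j\, S^{-1} e_k e_l^T S^{-1},$$

$$\frac{\partial S^{-H}}{\partial s_{kl}^{(R)}} = -S^{-H} e_l e_k^T S^{-H}, \qquad \frac{\partial S^{-H}}{\partial s_{kl}^{(I)}} = j\, S^{-H} e_l e_k^T S^{-H}.$$

Using real Lagrange multipliers $\gamma_i$, one obtains the cost function

$$\Phi = \sum_{i=1}^{M} e_i^T S R S^H e_i + \sum_{i=1}^{M} \gamma_i \left(e_i^T S^{-H} S^{-1} e_i - 1/\alpha_i\right).$$

The minimizing $S$ must obey, for all $k$, $l$,

$$\frac{\partial \Phi}{\partial s_{kl}^{(R)}} = \frac{\partial \Phi}{\partial s_{kl}^{(I)}} = 0. \tag{4.22}$$

Now

$$\begin{aligned}
\frac{\partial \Phi}{\partial s_{kl}^{(R)}} &= \sum_{i=1}^{M} \left[ e_i^T e_k e_l^T R S^H e_i + e_i^T S R e_l e_k^T e_i \right] \\
&\quad - \sum_{i=1}^{M} \gamma_i \left[ e_i^T S^{-H} e_l e_k^T S^{-H} S^{-1} e_i + e_i^T S^{-H} S^{-1} e_k e_l^T S^{-1} e_i \right] \\
&= e_l^T R S^H e_k + e_k^T S R e_l \\
&\quad - e_k^T S^{-H} S^{-1} \Big( \sum_{i=1}^{M} \gamma_i e_i e_i^T \Big) S^{-H} e_l - e_l^T S^{-1} \Big( \sum_{i=1}^{M} \gamma_i e_i e_i^T \Big) S^{-H} S^{-1} e_k.
\end{aligned}$$

Then with $\Gamma = \mathrm{diag}\{\gamma_1, \ldots, \gamma_M\}$,

$$\frac{\partial \Phi}{\partial s_{kl}^{(R)}} = e_l^T \left[ R S^H - S^{-1} \Gamma S^{-H} S^{-1} \right] e_k + e_k^T \left[ S R - S^{-H} S^{-1} \Gamma S^{-H} \right] e_l. \tag{4.23}$$

Similarly

$$\frac{\partial \Phi}{\partial s_{kl}^{(I)}} = j \left( e_l^T \left[ R S^H - S^{-1} \Gamma S^{-H} S^{-1} \right] e_k - e_k^T \left[ S R - S^{-H} S^{-1} \Gamma S^{-H} \right] e_l \right). \tag{4.24}$$

From (4.22), (4.23), (4.24), one has for all $k$, $l$,

$$e_l^T \left[ R S^H - S^{-1} \Gamma S^{-H} S^{-1} \right] e_k = 0. \tag{4.25}$$

Thus

$$R S^H = S^{-1} \Gamma S^{-H} S^{-1} \tag{4.26}$$

$$\Leftrightarrow \quad S R S^H S S^H = \Gamma. \tag{4.27}$$

Hence the result. ■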

Matrix  S  minimizing  (4.19)  under  (4.20)  thus  has  to  satisfy  (4.21).  Condition 
(4.21)  can  be  rewritten  using  the  following  Lemma. 

Lemma 4.3 Under the hypothesis of Lemma 4.2, with $\Gamma = \mathrm{diag}\{\gamma_1, \ldots, \gamma_M\}$,

$$\gamma_i > 0, \quad i = 1, \ldots, M, \tag{4.28}$$

and

$$S R S^H = \Gamma^{1/2} S^{-H} S^{-1} \Gamma^{1/2}. \tag{4.29}$$

Proof: Denote

$$A = S R S^H \quad \text{and} \quad B = (S S^H)^{-1},$$

and observe that $A$ and $B$ are Hermitian positive definite. Then under (4.21),

$$A = \Gamma B.$$

Further since $A$ and $B$ are both positive definite, for all $i$, $e_i^T B e_i > 0$, and

$$0 < e_i^T \Gamma B e_i = \gamma_i\, e_i^T B e_i,$$

implying $\gamma_i > 0$. Hence $\Gamma^{1/2}$ is a diagonal real positive definite matrix. Also as
$A = A^H$, $B = B^H$, and $\Gamma$ is diagonal,

$$\Gamma B = B \Gamma. \tag{4.30}$$

Further because of (4.30), for each $i$, $j$,

$$\gamma_i b_{ij} = b_{ij} \gamma_j,$$

holding if and only if either

$$\gamma_i = \gamma_j \tag{4.31}$$

or

$$b_{ij} = 0. \tag{4.32}$$

Now if (4.31) holds then

$$[\Gamma B]_{ij} = \gamma_i b_{ij} = \gamma_i^{1/2} b_{ij} \gamma_j^{1/2}.$$

The above clearly holds even under (4.32). Thus

$$\Gamma B = \Gamma^{1/2} B \Gamma^{1/2}$$

and (4.29) holds. ■


We  now  have  the  following  Lemma,  which  shows  that  at  least  one  S  minimizing 
(4.17)  has  in  fact  a  simpler  expression. 


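Lemma 4.4 At least one minimizing $S$ for (4.17) obeys

$$S R S^H = S^{-H} S^{-1}. \tag{4.33}$$

Proof: Because of Lemma 4.1, if $S$ minimizes (4.17) then so does $\tilde{S} = \Gamma^{-1/4} S$, with
$\Gamma$ as in Lemma 4.3.

From Lemmas 4.2 and 4.3, there exists a positive definite diagonal $\Gamma$ such that $S$
obeys

$$S R S^H = \Gamma^{1/2} S^{-H} S^{-1} \Gamma^{1/2}.$$

Then $\tilde{S} = \Gamma^{-1/4} S$ also minimizes (4.17), as

$$\begin{aligned}
\tilde{S} R \tilde{S}^H &= \Gamma^{-1/4} S R S^H \Gamma^{-1/4} \\
&= \Gamma^{1/4} S^{-H} S^{-1} \Gamma^{1/4} \\
&= \tilde{S}^{-H} \tilde{S}^{-1}.
\end{aligned}$$

■

We now show that the search space of $S_0$ minimizing (4.17) can be restricted to a
particular form. Consider then the following result.

Lemma 4.5 Let the SVD of positive definite Hermitian $R$ be

$$R = U \Lambda^2 U^H \tag{4.34}$$

with $\Lambda$ real, diagonal and $U$ unitary. Then for some unitary $V$, (4.17) is minimized
by

$$S = V \Lambda^{-1/2} U^H \tag{4.35}$$

and (4.17) becomes

$$J(S) = \sum_{i=1}^{M} \alpha_i \left( \left[ V \Lambda V^H \right]_{ii} \right)^2. \tag{4.36}$$

Proof: From Lemma 4.4, $S$ obeys (4.33). Let $S$ have the SVD

$$S = \bar{V} \hat{\Gamma} \hat{V}^H \tag{4.37}$$

with $\hat{\Gamma}$ positive definite real diagonal, and $\bar{V}$, $\hat{V}$ unitary. Then (4.33) holds if and only if

$$\bar{V} \hat{\Gamma} \hat{V}^H U \Lambda^2 U^H \hat{V} \hat{\Gamma} \bar{V}^H = \bar{V} \hat{\Gamma}^{-2} \bar{V}^H \quad \Leftrightarrow \quad (\hat{V}^H U) \Lambda^2 (U^H \hat{V}) = \hat{\Gamma}^{-4}. \tag{4.38}$$

Since $\bar{U} = \hat{V}^H U$ is unitary, for some permutation matrix $P$,

$$\hat{\Gamma}^{-4} = P \Lambda^2 P^T \quad \Rightarrow \quad \hat{\Gamma} = P \Lambda^{-1/2} P^T. \tag{4.39}$$

Thus

$$S R S^H = \bar{V} \hat{\Gamma}^{-2} \bar{V}^H = \bar{V} P \Lambda P^T \bar{V}^H = V \Lambda V^H.$$

Thus with $V = \bar{V} P$, (4.36) holds. Further it is easily checked that $S$ in (4.35) achieves
(4.36). ■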

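We now return to the minimization of (3.13) with $R = R_w$ having the form (4.34).
Because of Lemma 4.5, with $S_0 = V \Lambda^{-1/2} U^H$, for any choice of $\mathcal{I}_k$ and $b_{j,k}$, and
subject to $V$ being unitary,

$$P_B \ge \sum_{k=1}^{r} d_k \sum_{j \in \mathcal{I}_k} 2^{\zeta_k b_{j,k}} \left( \left[ V \Lambda V^H \right]_{jj} \right)^2 \tag{4.40}$$

$$\ge \min_{V V^H = I} \sum_{k=1}^{r} d_k n_k\, 2^{\zeta_k b_k / n_k} \prod_{j \in \mathcal{I}_k} \left( \left[ V \Lambda V^H \right]_{jj} \right)^{2/n_k} \tag{4.41}$$

$$= \sum_{k=1}^{r} \beta_k \prod_{j \in \mathcal{I}_k} a_j^{2/n_k} \tag{4.42}$$

with

$$\beta_k = d_k n_k\, 2^{\zeta_k b_k / n_k}$$

and, with $V^*$ denoting the minimizing $V$ in (4.41),

$$a_i = \left[ V^* \Lambda V^{*H} \right]_{ii}. \tag{4.43}$$

For given $\beta_k$ and $a_k$ consider the quantity in (4.42). For such a given choice, one must
determine the arrangement of $a_k$ that minimizes this quantity. Such an arrangement
is called an optimum arrangement. The following Lemma characterizes an important
property of such arrangements.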


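Lemma 4.6 Consider, for integers $p, q \ge 2$,

$$f = \Big( \alpha \prod_{k=0}^{p-1} a_k \Big)^{2/p} + \Big( \beta \prod_{l=0}^{q-1} b_l \Big)^{2/q}$$

with $\alpha, \beta, a_k, b_l > 0$. Suppose for some $i$, $j$,

$$a_i > b_j \quad \text{and} \quad \frac{\partial f}{\partial a_i} > \frac{\partial f}{\partial b_j}. \tag{4.44}$$

Then

$$g = \Big( \alpha\, b_j \prod_{k=0, k \ne i}^{p-1} a_k \Big)^{2/p} + \Big( \beta\, a_i \prod_{k=0, k \ne j}^{q-1} b_k \Big)^{2/q} < f. \tag{4.45}$$

Further if the $a_i$, $b_i$ obey the ordering $a_i \ge a_{i+1}$, $b_i \ge b_{i+1}$, then $\partial f / \partial a_i \le \partial f / \partial a_{i+1}$
and $\partial f / \partial b_i \le \partial f / \partial b_{i+1}$.

Proof: The derivative condition in (4.44) holds iff

$$\frac{2}{p} \frac{\big( \alpha \prod_{k=0}^{p-1} a_k \big)^{2/p}}{a_i} > \frac{2}{q} \frac{\big( \beta \prod_{k=0}^{q-1} b_k \big)^{2/q}}{b_j},$$

which is equivalent to

$$\Big( \alpha \prod_{k=0, k \ne i}^{p-1} a_k \Big)^{2/p} > \frac{p}{q} \frac{a_i^{1-2/p}}{b_j^{1-2/q}} \Big( \beta \prod_{k=0, k \ne j}^{q-1} b_k \Big)^{2/q}.$$

Consider

$$\begin{aligned}
f - g &= \Big( \alpha \prod_{k=0, k \ne i}^{p-1} a_k \Big)^{2/p} \left[ a_i^{2/p} - b_j^{2/p} \right] + \Big( \beta \prod_{k=0, k \ne j}^{q-1} b_k \Big)^{2/q} \left[ b_j^{2/q} - a_i^{2/q} \right] \\
&> \frac{p}{q} \frac{a_i^{1-2/p}}{b_j^{1-2/q}} \Big( \beta \prod_{k=0, k \ne j}^{q-1} b_k \Big)^{2/q} \left[ a_i^{2/p} - b_j^{2/p} \right] + \Big( \beta \prod_{k=0, k \ne j}^{q-1} b_k \Big)^{2/q} \left[ b_j^{2/q} - a_i^{2/q} \right] \\
&= \Big( \beta \prod_{k=0}^{q-1} b_k \Big)^{2/q} \left[ \frac{p}{q} \Big( \frac{a_i}{b_j} \Big) - \frac{p}{q} \Big( \frac{a_i}{b_j} \Big)^{1-2/p} + 1 - \Big( \frac{a_i}{b_j} \Big)^{2/q} \right].
\end{aligned}$$

Denote

$$h(\lambda) = \frac{p}{q} \lambda - \frac{p}{q} \lambda^{1-2/p} + 1 - \lambda^{2/q}.$$

Since $a_i > b_j$, to prove (4.45) it suffices to show that $h(\lambda) \ge 0$ for all $\lambda > 1$.
Indeed note that

$$h(1) = 0,$$

and as $p, q \ge 2$,

$$\begin{aligned}
h'(\lambda) &= \frac{p}{q} - \frac{p}{q} \Big( 1 - \frac{2}{p} \Big) \lambda^{-2/p} - \frac{2}{q} \lambda^{2/q - 1} \\
&\ge \frac{p}{q} - \frac{p}{q} \Big( 1 - \frac{2}{p} \Big) - \frac{2}{q} \qquad \forall \lambda > 1 \\
&= 0.
\end{aligned}$$

Further

$$\frac{\partial f}{\partial a_i} = \frac{2}{p} \frac{\big( \alpha \prod_{k=0}^{p-1} a_k \big)^{2/p}}{a_i},$$

and since $a_i \ge a_{i+1}$, it follows that $\partial f / \partial a_i \le \partial f / \partial a_{i+1}$. Similarly one can show that
under $b_i \ge b_{i+1}$, one has $\partial f / \partial b_i \le \partial f / \partial b_{i+1}$.

Hence the result. ■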
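A small numerical instance may help in reading Lemma 4.6. The sketch below evaluates $f$, checks the two conditions in (4.44) for one pair $(a_i, b_j)$, and confirms that swapping the pair between the two groups lowers the cost. All numerical values and function names are hypothetical.

```python
from math import prod


def two_group_cost(alpha, a, beta, b):
    """f = (alpha * prod(a))**(2/p) + (beta * prod(b))**(2/q)."""
    return (alpha * prod(a)) ** (2.0 / len(a)) + (beta * prod(b)) ** (2.0 / len(b))


def partial_derivative(weight, group, index):
    """Derivative of (weight * prod(group))**(2/p) with respect to group[index]."""
    p = len(group)
    return (2.0 / p) * (weight * prod(group)) ** (2.0 / p) / group[index]


def swap(a, i, b, j):
    """Exchange a[i] and b[j], returning new lists."""
    a_new, b_new = list(a), list(b)
    a_new[i], b_new[j] = b[j], a[i]
    return a_new, b_new


alpha, beta = 1.0, 1.0
a = [4.0, 3.0]            # p = 2
b = [2.0, 1.0, 1.0]       # q = 3
i, j = 0, 0

condition = (a[i] > b[j] and
             partial_derivative(alpha, a, i) > partial_derivative(beta, b, j))
f_value = two_group_cost(alpha, a, beta, b)
a_swapped, b_swapped = swap(a, i, b, j)
g_value = two_group_cost(alpha, a_swapped, beta, b_swapped)
print(condition, f_value, g_value)   # condition holds and g_value < f_value
```

In words, a large value sitting in a group where the cost is steep should trade places with a smaller value from a group where the cost is flatter; an optimum arrangement is one where no such improving swap remains.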


We thus have

$$P_B \ge \min_{\mathcal{I}_k} \sum_{k=1}^{r} \beta_k \prod_{j \in \mathcal{I}_k} a_j^{2/n_k} = P_B^* \tag{4.46}$$

with the minimizing $\mathcal{I}_k$ defining the optimal arrangement of the sequence of $a_k$. From Lemma 4.6
such an optimum arrangement requires that for all $a_k \ge a_l$,

$$\frac{\partial P_B^*}{\partial a_k} \le \frac{\partial P_B^*}{\partial a_l}.$$

Then using Theorem 6.1 and under the above condition, $P_B^*$ is Schur concave. Thus
from Fact 4, as $\{[V \Lambda V^H]_{ii}\} \prec \{\lambda_1, \ldots, \lambda_M\}$, the choice $V = I$ minimizes (4.46),
i.e. the minimum $P_B$ is attained through the choice of an orthonormal transformation.
Thus even under the relaxed conditions on $S_0$, the best power is achieved with
orthonormal $S_0$.

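Thus, regardless of $A$ the best $S_0$ is a Karhunen-Loeve Transform of $R_w$, and the
$a_i$ equal the eigenvalues of $R_w$. Then from the definition of supermajorization in
Definition 6.1 and Fact 5, it follows that the optimizing $A$ must be such that the
set of resulting eigenvalues of $R_w$ weakly supermajorizes all possible sets of attainable
eigenvalues. As $V_c$ is unitary in (3.12), the eigenvalues of $R_w$ are the same as those of

$$\begin{aligned}
\Omega(A) &= \Lambda_c^{-1} \begin{bmatrix} I_M & A \end{bmatrix} \begin{bmatrix} U_0^H \\ U_1^H \end{bmatrix} R_v \begin{bmatrix} U_0 & U_1 \end{bmatrix} \begin{bmatrix} I_M \\ A^H \end{bmatrix} \Lambda_c^{-1} \\
&= \Lambda_c^{-1} \begin{bmatrix} I_M & A \end{bmatrix} \begin{bmatrix} U_0^H R_v U_0 & U_0^H R_v U_1 \\ U_1^H R_v U_0 & U_1^H R_v U_1 \end{bmatrix} \begin{bmatrix} I_M \\ A^H \end{bmatrix} \Lambda_c^{-1} \\
&= \Lambda_c^{-1} \left[ U_0^H R_v U_0 + U_0^H R_v U_1 A^H + A U_1^H R_v U_0 + A U_1^H R_v U_1 A^H \right] \Lambda_c^{-1}.
\end{aligned} \tag{4.47}$$

Now observe that as $R_v$ is positive definite, and $U = [U_0 \; U_1]$ is unitary,

$$\begin{bmatrix} U_0^H \\ U_1^H \end{bmatrix} R_v \begin{bmatrix} U_0 & U_1 \end{bmatrix} = \begin{bmatrix} U_0^H R_v U_0 & U_0^H R_v U_1 \\ U_1^H R_v U_0 & U_1^H R_v U_1 \end{bmatrix} \tag{4.48}$$

is positive definite. Thus, each of the matrices $U_1^H R_v U_1$, $U_0^H R_v U_0$ and
$U_0^H R_v U_0 - U_0^H R_v U_1 (U_1^H R_v U_1)^{-1} U_1^H R_v U_0$ must be positive definite and nonsingular. Direct
verification shows that because of (4.47),

$$\begin{aligned}
\Omega(A) = \Lambda_c^{-1} \Big\{ & U_0^H R_v U_0 - U_0^H R_v U_1 (U_1^H R_v U_1)^{-1} U_1^H R_v U_0 \\
& + \left[ U_0^H R_v U_1 (U_1^H R_v U_1)^{-1} + A \right] U_1^H R_v U_1 \left[ A^H + (U_1^H R_v U_1)^{-1} U_1^H R_v U_0 \right] \Big\} \Lambda_c^{-1}.
\end{aligned}$$

Define

$$Q_1 = \Lambda_c^{-1} \left[ U_0^H R_v U_0 - U_0^H R_v U_1 (U_1^H R_v U_1)^{-1} U_1^H R_v U_0 \right] \Lambda_c^{-1} \tag{4.49}$$

$$Q_2 = \Lambda_c^{-1} \left[ U_0^H R_v U_1 (U_1^H R_v U_1)^{-1} + A \right] U_1^H R_v U_1 \left[ A^H + (U_1^H R_v U_1)^{-1} U_1^H R_v U_0 \right] \Lambda_c^{-1}. \tag{4.50}$$

Clearly $Q_1$ is positive definite and $Q_2$ is positive semidefinite. Only $Q_2$ depends on
$A$. Defining $\lambda_i(\Omega(A))$ as the eigenvalues of $\Omega(A)$, from Lemma 6.1 and Fact 1,

$$\{\lambda_i(\Omega(A))\}_{i=1}^{M} \prec \{\lambda_i(Q_1) + \lambda_i(Q_2)\}_{i=1}^{M}$$

$$\Rightarrow \quad \{\lambda_i(\Omega(A))\}_{i=1}^{M} \prec^w \{\lambda_i(Q_1) + \lambda_i(Q_2)\}_{i=1}^{M}.$$

Since $\lambda_i(Q_2) \ge 0$, from Fact 3

$$\{\lambda_i(Q_1) + \lambda_i(Q_2)\}_{i=1}^{M} \prec^w \{\lambda_i(Q_1)\}_{i=1}^{M}. \tag{4.51}$$

Thus, from Fact 2

$$\{\lambda_i(\Omega(A))\}_{i=1}^{M} \prec^w \{\lambda_i(Q_1)\}_{i=1}^{M}. \tag{4.52}$$

Thus the optimizing $A$ is one which forces $Q_2 = 0$, i.e.

$$A = -U_0^H R_v U_1 \left( U_1^H R_v U_1 \right)^{-1}. \tag{4.53}$$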

This is independent of $S_0$ and the optimum bit rate allocations $b_{j,k}$. Instead it is
determined exclusively by $R_v$, provided by the second order statistics of $v(k)$, and $U_c$,
provided by the SVD of the blocked channel-equalizer combination matrix $C(z)$. The
resulting value of $R_w$ is

$$R_w = V_c \Lambda_c^{-1} \left[ U_0^H R_v U_0 - U_0^H R_v U_1 \left( U_1^H R_v U_1 \right)^{-1} U_1^H R_v U_0 \right] \Lambda_c^{-1} V_c^H. \tag{4.54}$$
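The role of the optimizing $A$ is easiest to see in the smallest possible case. The sketch below assumes $M = k = 1$, $U = I$, $\Lambda_c = 1$ and a real $2 \times 2$ positive definite $R_v$ with entries `r00`, `r01`, `r11` (all hypothetical simplifications), so that $\Omega(A)$ in (4.47) is the scalar $r_{00} + 2 r_{01} A + r_{11} A^2$ and weak supermajorization reduces to attaining the smallest value.

```python
def omega(a_gain, r00, r01, r11):
    """Scalar Omega(A) of (4.47) for M = k = 1, U = I, Lambda_c = 1, real Rv."""
    return r00 + 2.0 * r01 * a_gain + r11 * a_gain ** 2


def optimal_a(r01, r11):
    """(4.53) in the scalar case: A = -r01 / r11."""
    return -r01 / r11


def schur_complement(r00, r01, r11):
    """Q1 of (4.49) in the scalar case: r00 - r01**2 / r11."""
    return r00 - r01 ** 2 / r11


r00, r01, r11 = 2.0, 0.8, 1.0            # hypothetical noise correlations
a_star = optimal_a(r01, r11)
print(a_star, omega(a_star, r00, r01, r11), schur_complement(r00, r01, r11))

# Any other choice of A gives a larger noise variance at the output of S1.
for a_gain in (-2.0, -1.0, 0.0, 0.5):
    print(a_gain, omega(a_gain, r00, r01, r11))
```

Here the redundant sample is used as a linear predictor of the noise in the retained sample, and the residual variance equals the Schur complement, which is the scalar counterpart of (4.54).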

To summarize, the optimizing $A$ is obtained directly using (4.53) with the channel
characteristics supplying $U_c$ and the second order statistics of $v(k)$ supplying $R_v$. This
gives $R_w$ from (4.54). $S_0$ is then provided by the eigenvectors of $R_w$ permuted so that
with $\sigma^2_{e_i}$ the eigenvalues of $R_w$, an optimum arrangement characterized by Lemma 4.6
is attained. This gives the requisite $\sigma^2_{j,k}$, and (4.16) gives the optimum bit allocations
$b_{j,k}$.

It is interesting to note that the solution of $A$ is identical to that given in [7]
for the single user case. Modulo the permutation required to enforce the optimum
rearrangement requirement, the optimizing $S_0$ is also the same as for the single user
case. Thus, even though the optimum bit rate allocations differ in the single and
multiuser settings, the transceiver itself is identical.

To summarize, we showed that there is a conceptual separation between the three
selections, i.e. the optimizing $A$ is determined exclusively by $R_v$, provided by the
knowledge of the interference, and the channel-equalizer characteristics; $S_0$ is determined
entirely by $A$ and the channel-equalizer characteristics; $\mathcal{I}_k$ are determined entirely by
$R_e$, in turn provided by $S_1$ and $S_0$; and the bit allocations are determined once the
above quantities are found. Further, as noted in the Introduction, we showed that
without loss of generality, the optimizing $S_0$, $G_0$ are unitary.

5  Simulation  results 

In this section, we compare the transmit power of the DFT based DMT, without
and with optimum bit allocation, against that of an optimum unitary
transceiver. We assume the equalized channel to be $C(z) = 1 + 0.5z^{-1}$, and a noise
source $v(n)$ whose power spectral density is shown in fig. 3. We assume the DMT
system supports two user services. Both services employ QAM modulation schemes,
and the target rates for the two users are 600 Kbps and 1 Mbps respectively. The
$(i, j)$ on the x-axis of the plot indicates that users 1 and 2 were respectively allocated
$i$ and $j$ channels. The plot shows that there is a 10 dB saving in transmit
power with our design over the DFT based DMT under optimum bit allocation, and
a 14 dB improvement over the conventional DMT with no optimum bit allocation.

6  Conclusions 

In this paper, an optimum bit allocation strategy and design of a general biorthogonal
DMT multicarrier transceiver system employing zero padding redundancy were
presented, for minimizing the transmit power when different users with varied QoS
requirements are supported and are assigned potentially different numbers of subchannels.
We showed that no gains in transmit power can be obtained by considering
biorthogonal transceivers over orthogonal transceivers. Our results also showed that
should the channel/interference remain invariant after the initial connection is established,
then only bit loading and subchannel selection need be updated in response to
changing traffic needs. The optimum transceiver itself depends only on the channel
and interference conditions and not on the QoS requirements. Indeed to within a
permutation of subchannels, the optimum transceiver obtained here is identical to
that obtained in [7], [11].

[Plot: Power spectral density of noise]

Figure 3: Comparison of transmit power levels.


Appendix 

Relevant results from the theory of majorization are stated. First consider the
definition of majorization.

Definition 6.1 [Definition of majorization] Consider the following two sequences
$x = \{x_1, \ldots, x_n\}$ and $y = \{y_1, \ldots, y_n\}$ with the ordering $x_i \ge x_{i+1}$ and $y_i \ge y_{i+1}$.
Then we say that $y$ majorizes $x$, denoted as $x \prec y$, if the following holds with equality
at $k = n$:

$$\sum_{i=1}^{k} x_i \le \sum_{i=1}^{k} y_i, \quad 1 \le k \le n. \tag{6.55}$$

We say that $y$ weakly supermajorizes $x$, denoted $x \prec^w y$, if

$$\sum_{i=j}^{n} x_i \ge \sum_{i=j}^{n} y_i, \quad 1 \le j \le n. \tag{6.56}$$

We also have the following facts.

Fact 1 If $x \prec y$, then $x \prec^w y$.

Fact 2 If $x \prec^w y$ and $y \prec^w z$, then $x \prec^w z$.

Fact 3 Suppose $a = \{a_1, \ldots, a_n\}$, $a_i \ge 0$, then $(x + a) \prec^w x$.

One of the examples of majorization in matrix theory is the comparison between
the diagonal elements and eigenvalues of a Hermitian matrix. The following is a
general result that holds for Hermitian matrices.

Fact 4 If $H$ is an $n \times n$ Hermitian matrix with diagonal elements $\{h_1, \ldots, h_n\} = h$,
and eigenvalues $\{\lambda_1, \ldots, \lambda_n\} = \lambda$, then $h \prec \lambda$ on $\mathbb{R}^n$.

The following Lemma gives a comparison of eigenvalues of matrices and sums of
matrices.

Lemma 6.1 Consider two $M \times M$ Hermitian matrices $Q_1$ and $Q_2$. Suppose the
eigenvalues of $Q_1$, $Q_2$ and $Q_1 + Q_2$ are respectively $\lambda_i(Q_1)$, $\lambda_i(Q_2)$, and $\lambda_i(Q_1 + Q_2)$,
$\lambda_i(\cdot) \ge \lambda_{i+1}(\cdot)$. Then

$$\{\lambda_i(Q_1 + Q_2)\}_{i=1}^{M} \prec \{\lambda_i(Q_1) + \lambda_i(Q_2)\}_{i=1}^{M}.$$

Functions that preserve the ordering of majorization are said to be Schur convex.
Note the trivial fact that a function $\phi$ is Schur convex if and only if $-\phi$ is Schur
concave. The following defines Schur concave functions.

Definition 6.2 [Definition of Schur concavity] A real valued function $\phi(z) =
\phi(z_1, \ldots, z_n)$ defined on a set $\mathcal{A} \subset \mathbb{R}^n$ is said to be Schur concave on $\mathcal{A}$ if

$$x \prec y \text{ on } \mathcal{A} \;\Rightarrow\; \phi(x) \ge \phi(y).$$

$\phi$ is strictly Schur concave on $\mathcal{A}$ if strict inequality $\phi(x) > \phi(y)$ holds when $x$ is not
a permutation of $y$.

A useful condition for verifying if a given function $\phi$ is Schur concave is now
considered. The following theorem results in a test for strict Schur concavity.

Theorem 6.1 Let $\phi(z)$ be a scalar real valued function defined and continuous on
$\mathcal{D} = \{(z_1, \ldots, z_n) : z_1 \ge \ldots \ge z_n\}$, and differentiable on the interior of $\mathcal{D}$. Then $\phi(z)$
is Schur concave on $\mathcal{D}$ if $\partial \phi(z) / \partial z_k$ is increasing in $k$.
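The definitions above translate directly into short predicates. The following sketch implements majorization and weak supermajorization for finite real sequences and illustrates Fact 1, Fact 3, and Schur concavity on a standard example, the product of positive entries. The function names, the tolerance argument, and the test sequences are hypothetical.

```python
from math import prod


def majorizes(y, x, tol=1e-12):
    """True if x is majorized by y: leading partial sums of sorted x never
    exceed those of sorted y, and the totals agree."""
    xs, ys = sorted(x, reverse=True), sorted(y, reverse=True)
    if len(xs) != len(ys) or abs(sum(xs) - sum(ys)) > tol:
        return False
    partial_x = partial_y = 0.0
    for xi, yi in zip(xs, ys):
        partial_x += xi
        partial_y += yi
        if partial_x > partial_y + tol:
            return False
    return True


def weakly_supermajorizes(y, x, tol=1e-12):
    """True if y weakly supermajorizes x: every tail sum of sorted x is at
    least the corresponding tail sum of sorted y."""
    xs, ys = sorted(x, reverse=True), sorted(y, reverse=True)
    if len(xs) != len(ys):
        return False
    tail_x = tail_y = 0.0
    for xi, yi in zip(reversed(xs), reversed(ys)):
        tail_x += xi
        tail_y += yi
        if tail_x < tail_y - tol:
            return False
    return True


x, y = [2.0, 2.0], [3.0, 1.0]
print(majorizes(y, x), weakly_supermajorizes(y, x))       # Fact 1
shifted = [xi + ai for xi, ai in zip(y, [0.5, 0.0])]
print(weakly_supermajorizes(y, shifted))                  # Fact 3
print(prod(x) >= prod(y))                                 # product is Schur concave
```

The pair used here also matches Fact 4: the symmetric matrix with rows (2, 1) and (1, 2) has diagonal {2, 2} and eigenvalues {3, 1}.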

Fact 5 Suppose $\phi(z)$ satisfies the conditions of Theorem 6.1. Then $\phi(x) \ge \phi(y)$
whenever $x \prec^w y$.

ABSTRACT

In an earlier paper we had presented a novel dual channel identification
approach for mobile wireless communication systems.
Unlike traditional channel estimation methods that rely on training
symbols, this approach used a bent-pipe feedback mechanism
requiring the mobile station (MS) to send portions of its received
signal back to the Base Station (BS) for wireless channel identification.
Using a filter-bank decomposition concept, we introduced
an effective algorithm for identifying both the forward and the reverse
channels based only on this feedback information. This new
method permits transfer of computational burden from the MS to
the resource rich BS and leads to significant savings in bandwidth
consuming training signals. This paper proposes a more informative
feedback method leading to significant performance improvement
over our earlier scheme.

1.  INTRODUCTION 

Two important tasks in mobile wireless communications systems
are channel estimation, and compensation aided by frequent transmission
of training signals. In most future cellular systems the
forward link, carrying data from Base Stations (BS) to a Mobile
Station (MS), will support higher data rates than the reverse link.
Consequently, the estimation and compensation of the Forward Link
Channel (FLC) requires more resources and longer training sequences
than that of the Reverse Link Channel (RLC). Equally,
the current practice is to assign the compensation and estimation
of the FLC entirely to the MS, which generally has fewer computational
reserves than the BS.

To permit the resource rich BS to share in the compensation
of the FLC, and to reduce the bandwidth consuming training of
the FLC, in [2] we proposed a new approach to the estimation and
compensation of the FLC in mobile wireless communication systems
using a novel bent pipe feedback mechanism. In principle,
this feedback mechanism enables the BS to estimate and compensate
both the FLC and the RLC, without any training signals on either
link or resort to blind estimation techniques. While practical realities
temper these expectations, as we demonstrated in [2], this idea
has significant advantages.

Specifically, the approach of [2] requires that the MS feed back
to the BS a portion of the received signal, over the time slot conventionally
reserved for RLC training. Clearly, this permits the BS
to estimate the Roundtrip Channel (RTC). However, the key novelty
of our approach lies in the following discovery: by feeding
back only a portion, rather than the entire received signal, one
empowers the BS to identify both the FLC and the RLC from the
roundtrip feedback signal alone. This novel channel feedback does
not require high speed reverse links and naturally accommodates
asymmetric data link structures, and structures where the RLC and
FLC have different carrier frequencies. Furthermore, no additional
training signals are necessary for estimating the RLC at the BS,
though some training for synchronization will still be needed.

As the BS will miss changes in the FLC that occur within the
feedback latency, the MS must estimate and compensate the residual
ISI in the channel dynamics the BS cannot compensate. Over
reasonable distances and mobile speeds these changes are modest
enough to make the partially precompensated FLC dynamics relatively
mild. Thus, a 5 km roundtrip causes a feedback delay of
16.67 μs, a time span over which the FLC undergoes little change.
This is underscored by the fact that in GSM each data frame has a
duration of 557 μs, and training occurs only once per data frame.
Thus the channel variation within the resolution of this delay occurs
mainly because of the Doppler effect. Yet a vehicle traveling at
100 km/hr suffers a maximum Doppler shift of 55 Hz in the cellular
band; a shift not large enough to cause drastic changes in
the FLC characteristics over latencies of tens of microseconds.
Thus the residual ISI that must be equalized at the receiver will
be significantly milder, leading to the need for much shorter training
sequences on the FLC. Given that no training for estimation is
needed on the RLC, and that feedback data occupies the RLC training
slot used in conventional communication, this implies substantial
savings in the bandwidth devoted to the overall training, and a
significant transfer of the FLC compensation/estimation burden to
the BS. Simulations presented in [2] support this contention. This
scheme also permits the use of such adaptive coding at the BS as
has been advocated by several authors [3]-[6].
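The latency figures quoted above follow from elementary arithmetic, sketched below. The propagation speed is rounded to $3 \times 10^8$ m/s, the 55 Hz Doppler shift and the 557 μs frame duration are taken from the text as given, and the function names are hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 3.0e8   # rounded value


def roundtrip_delay_s(roundtrip_distance_m):
    """Propagation delay over the full roundtrip path."""
    return roundtrip_distance_m / SPEED_OF_LIGHT_M_PER_S


def phase_drift_deg(doppler_hz, interval_s):
    """Carrier phase rotation caused by a Doppler shift over an interval."""
    return 360.0 * doppler_hz * interval_s


delay = roundtrip_delay_s(5_000.0)
print(delay * 1e6)                         # about 16.67 microseconds
print(phase_drift_deg(55.0, delay))        # a fraction of a degree
print(delay / 557e-6)                      # delay as a fraction of the frame
```

A 55 Hz shift rotates the carrier phase by roughly a third of a degree over the feedback delay, which is consistent with the claim that the FLC changes little within the latency.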

The scheme of [2] is preliminary in nature. One of its disadvantages
is that it fails to make as efficient a use of the feedback
slot as is desirable. In particular a large fraction of the feedback
slot carries zero samples that do not contain useful information
about the FLC. The key contribution of this paper is to formulate a
more informative feedback scheme that carries more information
about the FLC, leading to improved FLC estimation.

2.  THE  FEEDBACK  SCHEME 

For  the  most  part  in  this  paper  we  assume  that  the  ratio  of  the  data 
rates  supported  on  the  FLC  and  RLC  is  M/L,  with  M  >  L.  Later 


we  will  comment  on  how  to  accomodate  the  case  of  M  <  L. 

In  fig.  1  H(z)  and  G(z)  are  the  discrete  time  baseband  models 
of  the  FLC  and  RLC  respectively,  Wi(n)  are  the  noise  sequence 
at  their  output,  x(n)  and  y(n)  are  respectively  the  data  sequence 
transmitted  and  received  by  the  BS.  The  samples  ,si  (n),  received 
at  the  MS  are  rate  converted  by  the  IV -branch  rate  convertor,  [1] 
that  generates  L  samples  for  every  M  samples  at  its  input,  i.e. 
effects  a  rate  conversion  by  a  factor  of  L/M.  The  sequence  S2(n) 
is  retransmitted  over  the  RLC  over  the  slots  usually  reserved  for 
RLC  training:  u(k )  models  the  interference  caused  by  the  normal 
RLC  data  because  of  imperfect  synchronization. 

In this arrangement $N < L < M$. Effectively, over sample lengths of $L$, $s_2(n)$ contains $N$ out of every $M$ samples of $s_1(n)$, and has in addition $L - N$ zeros. The scheme in [2] uses only the top branch of this arrangement, i.e. has $N = 1$. Consequently in [2], out of every $L$-symbol feedback slot only one sample contains the data received at the MS, with the remaining $L - 1$ symbols being zero samples. Thus in [2] the available feedback slot is underutilized as far as information exchange is concerned. This causes important information to be unnecessarily discarded, reducing the ability to track time variations and resulting in larger residual ISI in the FLC compensated on the basis of the estimate at the BS. As we will demonstrate in Section 4, the more sophisticated rate convertor with $N = L$ leads to improved performance.
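The sample bookkeeping of the rate convertor can be illustrated with a small block-wise sketch. This is only a hypothetical view: the function name `rate_convert`, the choice of keeping the first $N$ samples of each $M$-block, and the placement of the zeros at the end of each $L$-block are all assumptions, not the branch structure of Fig. 1.

```python
def rate_convert(s1, M, L, N):
    """Hypothetical block view of an N-branch L/M rate convertor.

    For every block of M input samples, keep N of them and pad with
    L - N zeros, so that L samples leave for every M that arrive.
    Which N samples are kept, and where the zeros sit, is an assumption.
    """
    if not (1 <= N <= L < M):
        raise ValueError("expected 1 <= N <= L < M")
    s2 = []
    for start in range(0, len(s1) - M + 1, M):
        block = s1[start:start + M]
        s2.extend(block[:N])          # N informative samples
        s2.extend([0] * (L - N))      # L - N zero samples
    return s2


samples = [1, 2, 3, 4, 5, 6]
old_scheme = rate_convert(samples, M=3, L=2, N=1)   # one useful sample per slot
new_scheme = rate_convert(samples, M=3, L=2, N=2)   # slot fully used
```

With $N = 1$ half of each two-symbol slot is zero, whereas with $N = L$ every symbol of the slot carries a received sample.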

Consider the $M$- and $L$-fold type I and II polyphase decompositions of $H(z)$ and $G(z)$ respectively, i.e. $H(z) = \sum_{i=0}^{M-1} E_i(z^M) z^{-i}$ and $G(z) = \sum_{i=0}^{L-1} R_i(z^L) z^{-(L-1-i)}$. Then [1], absent noise and interference, Fig. 1 can be transformed into Fig. 2, where $\mathbf{R}(z)$ and $\mathbf{E}(z)$ are respectively left and right pseudocirculant matrices given by


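$$\mathbf{R}(z) = \begin{pmatrix} R_0(z) & R_1(z) & \cdots & R_{N-1}(z) \\ R_1(z) & R_2(z) & \cdots & R_N(z) \\ \vdots & \vdots & & \vdots \\ R_{L-1}(z) & z^{-1}R_0(z) & \cdots & z^{-1}R_{N-2}(z) \end{pmatrix} \qquad (1)$$

and

$$\mathbf{E}(z) = \begin{pmatrix} E_0(z) & E_1(z) & \cdots & E_{M-1}(z) \\ z^{-1}E_{M-1}(z) & E_0(z) & \cdots & E_{M-2}(z) \\ \vdots & \vdots & & \vdots \\ z^{-1}E_{M-N+1}(z) & z^{-1}E_{M-N+2}(z) & \cdots & E_{M-N}(z) \end{pmatrix}. \qquad (2)$$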
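The type I and type II polyphase splits used here can be sketched for FIR impulse responses as follows. This is a minimal illustration using the standard definitions $e_i(n) = h(nM + i)$ and $r_i(n) = g(nL + L - 1 - i)$; the function names are hypothetical and zero padding to a multiple of the decimation factor is an implementation convenience.

```python
import numpy as np


def polyphase_type1(h, M):
    """Type I components: E_i has impulse response h[i], h[i+M], h[i+2M], ..."""
    h = np.asarray(h, dtype=float)
    h = np.concatenate([h, np.zeros((-len(h)) % M)])   # pad to a multiple of M
    return [h[i::M] for i in range(M)]


def polyphase_type2(g, L):
    """Type II components: R_i equals the type I component of index L-1-i."""
    return polyphase_type1(g, L)[::-1]


def recombine_type1(components):
    """Interleave type I components back into the (padded) impulse response."""
    M = len(components)
    h = np.zeros(M * len(components[0]))
    for i, e in enumerate(components):
        h[i::M] = e
    return h


E = polyphase_type1([1, 2, 3, 4, 5, 6, 7], M=3)
R = polyphase_type2([1, 2, 3, 4], L=2)
```

Interleaving the type I components reproduces the original filter, which is the time-domain counterpart of $H(z) = \sum_i E_i(z^M) z^{-i}$.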

Fig.  1.  System  model  of  improved  scheme:  Rate  changer  with  N 
branches. 


Fig.  2.  Polyphase  Representation. 


Assumption 1 The greatest common divisor (gcd) of the set of polynomials $R_i(z)$ is a pure delay $z^{-d}$ ($d$ integer). Further their maximum order $l_R$ is known.

Assumption 2 The gcd of the set of polynomials $E_i(z)$ is a pure delay $z^{-d}$ ($d$ integer). Further their maximum order $l_E$ is known.

To see why, observe that in the setting of [2], i.e. $N = 1$, (5) is replaced by $\mathbf{Y}(z) = R(z)E(z)\mathbf{X}(z)$. Thus the rank-1 matrix $R(z)E(z)$ can be estimated. Observe that the $k$-th row of $R(z)E(z)$ is simply $R_k(z)[E_0(z), \cdots, E_{M-1}(z)]$. Under Assumption 2, the gcd of the elements of this row provides, to within a delay and scaling, $R_k(z)$ and hence also $E_i(z)$ and $H(z)$. Similar unraveling is possible should Assumption 1 hold.

Observe that in the setting of this paper the rank-1 matrix $R(z)E(z)$ is not directly available. Yet in the next two sections we show that under either Assumption 1 or 2, $H(z)$ and $G(z)$ can be obtained to within a scaling and delay from the roundtrip dynamics captured by $\mathbf{R}(z)\mathbf{E}(z)$.


Define

$$E(z) = [E_0(z), \cdots, E_{M-1}(z)] \qquad (3)$$

$$R(z) = [R_0(z), \cdots, R_{L-1}(z)]^T. \qquad (4)$$

Define in Fig. 2 $\mathbf{Y}(z) = [Y_1(z), \cdots, Y_L(z)]^T$ and $\mathbf{X}(z) = [X_1(z), \cdots, X_M(z)]^T$, where $X_i(z)$ and $Y_i(z)$ are the $z$-transforms of $x_i(n)$ and $y_i(n)$. Then we have the following relation

$$\mathbf{Y}(z) = \mathbf{R}(z)\mathbf{E}(z)\mathbf{X}(z). \qquad (5)$$

Since $X_i(z)$ and $Y_i(z)$ are known to the BS, the round trip channel $\mathbf{R}(z)\mathbf{E}(z)$ can be estimated. The question is, under what conditions can one extract $H(z)$ and $G(z)$ from $\mathbf{R}(z)\mathbf{E}(z)$? Clearly the best one can hope for is to estimate $R(z)$ and $E(z)$ to within a scaling constant and common delays among the $E_i(z)$ and $R_i(z)$. In the case of [2], such an extraction is possible if either (not necessarily both) of the following conditions apply.


3.  PROOF  OF  IDENTIFIABILITY 

In this section we show that $E(z)$ is identifiable to within a scaling constant from $\mathbf{R}(z)\mathbf{E}(z)$ when $M > L > N$ and Assumption 2 holds with the common delay $d$ among the $E_i$ equalling zero. In Section 3.3, we discuss the case where this common delay is nonzero. The knowledge of $E(z)$ provides $H(z)$. A similar result can be formulated when Assumption 1 holds, or when $L > M > N$, or when $L = M > N$. Thus even the case of $L = M$ can be captured. In each case the selection of $N$ ensures that some received signal is discarded and $s_2(n) \neq s_1(n - k)$.

3.1.  Definitions  and  notations 

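For an $M \times N$ polynomial matrix $A(z) = \sum_{i=0}^{l} A(i) z^{-i}$, where $l$ is the degree of $A(z)$, define the $mM \times (l + m)N$ generalized Sylvester matrix of $A(z)$ as

$$T_m(A) = \begin{pmatrix} A(0) & \cdots & A(l) & & \\ & \ddots & & \ddots & \\ & & A(0) & \cdots & A(l) \end{pmatrix}. \qquad (6)$$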
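The block-Toeplitz structure of (6) can be made concrete with a short sketch. The helper names `sylvester` and `polymat_mul` are hypothetical. The check at the end uses the general fact that, with the dimensions of (6), a product of polynomial matrices satisfies $T_m(AB) = T_m(A)\,T_{m+l_A}(B)$, which is the kind of factorization the identifiability argument relies on; the index convention in that check follows (6) as reconstructed here and is an assumption about the paper's own indexing.

```python
import numpy as np


def sylvester(coeffs, m):
    """Generalized Sylvester matrix T_m(A) for A(z) = sum_i coeffs[i] z^{-i}.

    coeffs holds the l+1 coefficient matrices A(0..l), each M x N.
    The result has m block rows and l+m block columns: mM x (l+m)N.
    """
    coeffs = [np.asarray(c, dtype=float) for c in coeffs]
    l = len(coeffs) - 1
    M, N = coeffs[0].shape
    T = np.zeros((m * M, (l + m) * N))
    for row in range(m):
        for i, Ai in enumerate(coeffs):
            # block row `row` carries A(0..l) shifted right by `row` blocks
            T[row * M:(row + 1) * M, (row + i) * N:(row + i + 1) * N] = Ai
    return T


def polymat_mul(A, B):
    """Coefficient matrices of the product A(z) B(z)."""
    out = [np.zeros((A[0].shape[0], B[0].shape[1]))
           for _ in range(len(A) + len(B) - 1)]
    for i, Ai in enumerate(A):
        for j, Bj in enumerate(B):
            out[i + j] += Ai @ Bj
    return out


rng = np.random.default_rng(0)
A = [rng.standard_normal((2, 3)) for _ in range(3)]   # degree l_A = 2
B = [rng.standard_normal((3, 4)) for _ in range(2)]   # degree l_B = 1
m = 4
lhs = sylvester(polymat_mul(A, B), m)
rhs = sylvester(A, m) @ sylvester(B, m + len(A) - 1)
```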

For the $r \times Mm$ matrix $B = [B(0), \cdots, B(m-1)]$, define

$$V_{m,M}(B) = \sum_{i=0}^{m-1} B(i) z^{-i}, \qquad (7)$$

where $B(i)$ and $V_{m,M}(B)$ have dimension $r \times M$. Note that $V_{m,M}(B)$ is a function of $z$.

Define an $M \times M$ polynomial matrix $Q_M(z)$ as

Find $B$ whose rows span the left nullspace of $T_m(\mathbf{C}^T)$, where $m > N l_E + N - 1$, and construct $\mathcal{B}(z)$ defined in (10). Because of the assumed lack of correlation between $x(n)$ and the noise and interference, $B$ is provided by $V_n$. Then solving for $T_1(E)$ as the eigenvector corresponding to the smallest eigenvalue of $T_{l_E}(\mathcal{B}) T_{l_E}(\mathcal{B})^H$, where $(\cdot)^H$ indicates the conjugate transpose, provides $E(z)$. Since $T_{l_R+1}(E)$ has full row rank, one also finds $G(z)$ to within a scaling constant, using

$$T_1(R) = T_1(C) \left(T_{l_R+1}(E)\right)^{\dagger}.$$

If the $E_i(z)$ have a common delay then this manifests in certain columns of $T_m(\mathbf{C}^T)$ being zero. Then applying the above procedure to the matrix with these zero columns removed provides $H(z)$ and $G(z)$ to within a scaling and a delay.


Qm(z) 


Z  ^ 


Im-1 


(8) 


where $I_{M-1}$ denotes the $(M - 1) \times (M - 1)$ identity matrix and $0_{m \times n}$ denotes the $m \times n$ zero matrix.


3.2.  Identifiability 

With  C(z)  =  R(z)E(z), 

Tm  (CT)  =  Tm  (ET)TlE  +  l  +  m  (RT)  (9) 

Whenever $G(z)H(z) \neq 0$, $T_m(\mathbf{R}^T)$ and $T_m(\mathbf{E})$ are full rank for all integers $m > 0$. Hence $T_m(\mathbf{C}^T)$ and $T_m(\mathbf{E}^T)$ have identical left nullspaces. Thus the knowledge of $\mathbf{C}(z)$ provides the left nullspace of $T_m(\mathbf{E}^T)$. The following theorem shows that the left null space of $T_m(\mathbf{E}^T)$ under Assumption 2 provides $H(z)$ to within a scaling.


4.  SIMULATIONS 

We  present  two  simulation  examples.  The  first  example  shows  the 
basic  performance  of  the  scheme  in  this  paper  relative  to  that  of 
[2].  The  second  example  illustrates  the  reduction  of  training  levels 
needed  on  the  FLC  when  channel  parameters  change  with  time. 

4.1.  Simulation  I 

In the simulation, the FLC and RLC are generated from two delayed raised-cosine pulses $C(t, \alpha)$, where $\alpha$ is the roll-off factor. $C(t, \alpha)$ is limited to $8T$ for the FLC and to $6T$ for the RLC. The FLC and RLC have the respective analog models $0.3C(t, 0.25) + 0.8C(t - T/2, 0.25)$ and $0.5C(t, 0.10) + 0.6C(t - T/3, 0.10)$. We use downsampling factor $M = 3$, upsampling factor $L = 2$ and $N = L$. The noises $w_1(n)$ and $w_2(n)$ are zero mean and have the same variance. The input signal $x(n)$ is i.i.d. BPSK. To obtain a performance measure of channel estimation, we define the normalized root-mean-square error (NRMSE) as


Theorem 1 Suppose $E(z)$ and $\mathbf{E}(z)$ are defined in (3) and (2) respectively and Assumption 2 is in force, with $d = 0$ and $M > L > N$. Then for any integer $m > N l_E + N - 1$, $T_m(\mathbf{E}^T)$ has a nontrivial left null space. Suppose $B$ is a matrix whose rows span the left nullspace of $T_m(\mathbf{E}^T)$. Let

$$\mathcal{B}(z) = [(V_{m,M}(B))^T, \cdots, Q_M^{N-1}(V_{m,M}(B))^T]. \qquad (10)$$

Consider an $M$-dimensional nonzero polynomial row vector $\hat{E}(z)$ with degree $l_E$. Then

NRMSE  = 


\ 


•flat 


(■) 


(13) 


where $M_t$ is the number of Monte Carlo runs, $h$ is the actual channel and $\hat{h}_{(i)}$ is the $i$-th estimate. In our simulation $M_t = 100$. In each run 600 symbols are used. We call the scheme in [2] the old scheme and the scheme in this paper the new scheme. Fig. 3 shows NRMSE versus input SNR, where input SNR is defined as


$$T_1(\hat{E})\, T_{l_E}(\mathcal{B}) = 0 \ \text{ iff } \ \hat{E}(z) = cE(z) \qquad (11)$$


Input  SNR  =  101ogl0E(^2(u))/E(uq(u))  (14) 


where $c$ is a nonzero constant.

Thus indeed the left null space of $T_{l_E}(\mathcal{B})$, constructed from the left nullspace of $T_m(\mathbf{C}^T)$, provides $E(z)$ to within a scaling.

3.3.  A  Subspace  Algorithm 

Assume Assumption 2 holds with $d = 0$. Suppose the signals $x(n)$, $u(n)$, $w_1(n)$ and $w_2(n)$ are zero mean, white and mutually uncorrelated. In view of the noise-free model of (5), and the knowledge of $x(n)$ and $y(n)$ at the BS, a standard least squares scheme provides an estimate of $T_m(\mathbf{C}^T)$ and hence its SVD:

T„(C*J  +  JV  =  (V.V.)(S0-  £)($)  (12) 


A zero-forcing pre-equalizer is constructed for the FLC and a post-equalizer for the RLC, on the basis of the FLC and RLC estimates. Both are housed at the BS. Equalizer SNR for the FLC is computed using the signal power at the FLC input. For the RLC, it uses the signal power at the equalizer output. BER versus equalizer SNR is displayed in Fig. 4, where the input symbols are BPSK. Both figures show the performance improvement due to the more informative feedback of this paper.

4.2.  Simulation  II:  Reduced  training  for  a  time  varying  chan¬ 
nel 

We use the COST-207 Typical Urban (TU) [7] model with 100 echo paths, BPSK data and a maximum Doppler frequency of 55 Hz. We assume the channels to be quasistatic, i.e., time-invariant in one frame and time-variant from frame to frame. The receive filters




Fig.  3.  NRMSE  versus  input  SNR  for  both  schemes 


Fig.  4.  BER  versus  equalizer  SNR  for  both  schemes. 


for the FLC and RLC are raised cosine functions with roll-off factors 0.2 and 0.1 respectively. The FLC sustains a data rate of 1 Mbps, and the RLC supports 0.667 Mbps. We use downsampling factor $M = 3$ and upsampling factor $L = 2$. The schemes with $N = 1$ and $N = 2$ are called the old scheme and the new scheme respectively. In both cases we compare two settings:

(a)  Training  aided  equalization  of  FLC  at  the  MS  and  of  the 
RLC  at  the  BS,  with  no  feedback. 

(b)  No  training  on  the  RLC,  but  instead  sending  feedback  data 
of  the  same  length  as  the  RLC  training  data  in  (a).  A  prec¬ 
ompensator,  obtained  by  the  new  scheme  or  the  old  scheme 
is  used  on  the  FLC  and  is  augmented  by  a  post-equalizer 
estimated  at  the  receiver  using  reduced  training. 

Methods  in  (a)  and  (b)  use  the  same  signal  power  at  the  FLC 
input.  Fig.  5  shows  ni/nt  versus  input  SNR  for  methods  in  (a) 
and  (b)  to  achieve  the  same  BER.  Here  nt  is  the  length  of  the 
FLC  training  sequence  used  in  (a)  and  nj  the  length  of  training 
used  in  (b),  so  that  the  same  FLC  BER  is  obtained  in  both  cases. 
As  is  evident  in  Fig.  5,  in  order  to  achieve  the  same  BER  as  the 
conventional  method,  the  new  scheme  needs  less  training  data  and 


thus saves more bandwidth than the old scheme when the input SNR exceeds 6 dB. By way of further comparison we note that at 18 dB SNR, while the training sequence length in [2] is 30% of (a), the length for the new scheme is only 15%, i.e. half that needed by [2].


Fig.  5.  ni/nt  versus  SNR  at  the  same  BER. 


5.  CONCLUSION 

In  this  paper,  a  bent  pipe  multi-branch  feedback  scheme  is  used 
to  estimate  FLC  and  RLC  from  RTC.  By  exploiting  the  properties 
of  the  nullspace  of  pseudocirculant  matrices,  the  identifiability  re¬ 
sult  and  unravelling  method  are  derived  for  this  improved  scheme. 
Since this improved scheme uses more information from the FLC, we get a performance improvement compared to the scheme in [2].
Abstract

In this paper, we provide a complete characterization of Discrete Multitone Transmission (DMT) systems that employ a cyclic prefix redundancy, and can be equalized by a bank of one tap equalizers in each subchannel, for almost all values of channel parameters. We show that among all possible FIR transmitting and receiving filters of arbitrary order, such channel resistant transmission requires (a) that the receive filters be matched to the transmit filters, and (b) that to within a scaling and delay, the transmit and receive filters have IDFT and DFT coefficients. Thus we prove that, should cyclic prefix be applied, only trivial variants of traditional DFT based DMT systems are channel resistant.
1  Introduction


Discrete multitone (DMT) modulation, depicted in Fig. 1, is a standard in many wireline and wireless applications [1]. An $M$-point block transformation $A(z)$ is applied to $M$ parallel data streams, followed by a parallel to serial conversion (block P to S). For an FIR channel with transfer function

$$H(z) = \sum_{i=0}^{\kappa} h_i z^{-i}, \qquad (1.1)$$

a cyclic prefix redundancy of length $\kappa$ is added by the CPI block, and is removed by the CPR block. Thus each $M$-block of $v(n)$ is converted to an $N$-block of $s(n)$, $N = \kappa + M$, by prepending the last $\kappa$ samples of the block. After CPR, one performs serial to parallel conversion (S to P block) and the inverse block transformation $B(z)$. The overall system has the equivalent description of Fig. 2. In traditional DMT $A(z)$ is a block IDFT operation and $B(z)$ is a block DFT operation, i.e. with $W$ the $M$-point DFT matrix having $ik$-th element

$$[W]_{ik} = \frac{1}{\sqrt{M}} e^{-j 2\pi i k / M}, \quad A(z) = W^H \ \text{ and } \ B(z) = W. \qquad (1.2)$$


Figure 1: The DMT system.


Under (1.2), $F_i(z)$ and $G_i(z)$ are mutually matched, respectively causal and anticausal of degree $M - 1$, with the coefficients of $F_i(z)$ being the coefficients of the $i$-th column of $W^H$. This leads to considerable spectral overlap between the subchannels. Cyclic prefix and (1.2) ensure that, with $X(n) = [x_0(n), \cdots, x_{M-1}(n)]^T$ and $\hat{X}(n) = [\hat{x}_0(n), \cdots, \hat{x}_{M-1}(n)]^T$,

$$\hat{X}(n) = \Lambda(h_0, \cdots, h_\kappa) X(n) \qquad (1.3)$$

with diagonal $\Lambda(h_0, \cdots, h_\kappa)$ nonsingular for almost all $h_i$, barring those for which $H(z)$ has a unit circle zero at an $M$-th root of unity. Thus, for almost all channels of order no greater than $\kappa$, a one tap equalizer at each subchannel output ensures ISI removal. We call such a system Channel Resistant DMT.
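The diagonalization behind (1.3) can be checked numerically. The sketch below assumes the unitary DFT scaling $1/\sqrt{M}$ and models the combined effect of cyclic prefix insertion, the FIR channel and prefix removal as the $M \times M$ circulant matrix with first row $[h_0, 0, \cdots, 0, h_\kappa, \cdots, h_1]$ (formalized in Section 2). The function names are hypothetical.

```python
import numpy as np


def dft_matrix(M):
    """Unitary M-point DFT matrix (the 1/sqrt(M) scaling is an assumption)."""
    n = np.arange(M)
    return np.exp(-2j * np.pi * np.outer(n, n) / M) / np.sqrt(M)


def channel_circulant(h, M):
    """M x M circulant with first row [h0, 0, ..., 0, h_kappa, ..., h1]."""
    row = np.zeros(M, dtype=complex)
    row[0] = h[0]
    for i in range(1, len(h)):
        row[M - i] = h[i]
    shift = (np.arange(M)[None, :] - np.arange(M)[:, None]) % M
    return row[shift]          # entry (r, c) equals row[(c - r) mod M]


def subchannel_gains(h, M):
    """B H A with A = W^H and B = W, which should come out diagonal."""
    W = dft_matrix(M)
    return W @ channel_circulant(h, M) @ W.conj().T


h = [1.0, 0.5, -0.25]                        # illustrative channel, kappa = 2
Lam = subchannel_gains(h, 8)
off_diagonal = Lam - np.diag(np.diag(Lam))   # numerically negligible
gains = np.diag(Lam)                         # H(z) sampled at the 8th roots of unity
```

The diagonal entries coincide with the $M$-point DFT of the channel taps, so a subchannel gain vanishes exactly when $H(z)$ has a zero at an $M$-th root of unity.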


More general orthogonal block transforms, with zero padding redundancy and precoding, are in [2]-[6]. Improved spectral separation through longer $F_i(z)$ and $G_i(z)$ is advocated in [7]. Given the prevalence of cyclic prefix redundancy in practical systems, and the issues raised in [7], we ask whether there are longer length $F_i(z)$ and $G_i(z)$ that, together with cyclic prefix redundancy, ensure channel resistance. We thus characterize all $M \times M$ FIR

$$A(z) = \sum_{i=-p_1}^{p_2} A_i z^{-i} \quad \text{and} \quad B(z) = \sum_{i=-q_1}^{q_2} B_i z^{-i} \qquad (1.4)$$

such that with cyclic prefix redundancy, and $H(z)$ as in (1.1), (1.3) holds with diagonal $\Lambda(h_0, \cdots, h_\kappa)$ nonsingular for almost all $h_i$. We show that all such $A(z)$, $B(z)$ yield $F_i(z)$, $G_i(z)$ that are mutually matched to within complex scaling constants, and are identical to their counterparts yielded by the conventional DMT to within scaling constants and delays. Thus channel resistance with cyclic prefix is incompatible with greater spectral separation.


2  Formulation 

In Fig. 1 successive $M$-blocks of $v(n)$ and $p(n)$ and $N$-blocks of $r(n)$ and $s(n)$ obey [2]:

v(n)  =  [■ v{Mn ),  ■  ■  ■ ,  u(Mn  -  M  +  1)]T  =  [yo{n),  ■  ■  • ,  yM-i{n)]T  =  A(z)X(n) 

0  IK 


s (n)  —  [s(Nn),  ■  ■  ■ ,  s(Nn  —  N  +  1)] 


-  / 


v(n) 


(2.5) 

(2.6) 


r(n)  =  [r{Nn),  ■  ■  ■  ,r(Nn  —  N  +  1)]T  — 


p(n)  =  \p(Mn),  ■  ■  ■  ,p(Mn  —  M  +  1)]T  — 


ho  z  '//.v  |  2  hx.  2 

hi  ho  z~1hx~i 

hN- i  hx- 2 


z~xh 

z~xh 


h\  ho 


0  I 


M 


r(n)  X{n)  —  B(z)p(n) 


s(n)  (2.7) 


(2.8) 


Define circulant($\eta$) to be the square circulant matrix with first row $\eta$. Then [2], one has

$$\mathcal{H} = \text{circulant}\,[h_0 \ \ 0 \ \cdots \ 0 \ \ h_\kappa \ \cdots \ h_1], \quad \hat{X}(n) = B(z)\mathcal{H}A(z)X(n), \qquad (2.9)$$

with $\mathcal{H}$ $M \times M$. Further we note that, because of (2.5) and (2.8), Fig. 1 is equivalent to Fig. 2 with

$$[F_0(z), \cdots, F_{M-1}(z)] = [1, z^{-1}, \cdots, z^{-(M-1)}]\, A(z^M), \quad [G_0(z), \cdots, G_{M-1}(z)]^T = [1, z, \cdots, z^{(M-1)}]\, B^T(z^M). \qquad (2.10)$$

Channel resistance then requires that with diagonal $\Lambda(h_0, \cdots, h_\kappa)$ nonsingular for almost all $h_i$,

$$B(z)\mathcal{H}A(z) = \Lambda(h_0, \cdots, h_\kappa). \qquad (2.11)$$
Observe with the $M \times M$ circulant shift matrix

$$J = \text{circulant}\,[0 \ \ 1 \ \cdots \ 0], \quad \mathcal{H} = h_0 I + \sum_{i=1}^{\kappa} h_i J^{M-i}. \qquad (2.12)$$

Thus (2.11) requires that with diagonal $\Lambda_i$ and at least one $\Lambda_i$ nonsingular

$$B(z)A(z) = \Lambda_0 \quad \text{and} \quad \forall\, 1 \le i \le \kappa, \ \ B(z)J^{M-i}A(z) = \Lambda_i. \qquad (2.13)$$

3  The  Main  Result 

The  following  Lemma  is  useful. 

Lemma 3.1 Suppose two $M \times M$ nonsingular matrices $B$ and $A$ are such that both $BA$ and $BJ^{M-1}A$ are diagonal. Then for some permutation matrix $P$ and diagonal matrices $D_B$ and $D_A$, $B = D_B P W^H$ and $A = W P^T D_A$.

Proof: As $BA$ is diagonal, the expression for $B$ ensures that for $A$. Call $A_i$ the $M \times (M - 1)$ matrix comprising all but the $i$-th column of $A$, and $b_i = [b_{i1}, \cdots, b_{iM}]$ the $i$-th row of $B$. As rank$(A_i) = M - 1$,

$$\begin{pmatrix} b_i \\ b_i J^{M-1} \end{pmatrix} A_i = 0 \ \Rightarrow \ \text{rank} \begin{pmatrix} b_i \\ b_i J^{M-1} \end{pmatrix} = 1.$$

Thus  for  some  cq  +  0,  [bn,  ■  ■  ■ ,  biM]  —  ai[bi2,  ■  •  -  Arf;  bn].  Thus,  for  1  <  li  <  M,  bu  —  (rthLi .  t  and 
bi  1  =  otibjM ■  Thus  af  =  1  and  b,  =  bn[l,  a*  ■  ■  ■ ,  a"-1].  ■ 

In (2.13), as at least one $\Lambda_i$ is nonsingular, $B(z)$, $A(z)$ are nonsingular. Thus from Lemma 3.1, for almost all $z$, $B(z) = D_B(z)P(z)W^H$ and $A(z) = WP^T(z)D_A(z)$, with $D_B(z)$, $D_A(z)$ diagonal and $P(z)$ a permutation matrix at almost all $z$. Observe that for $D_i$ nonzero and diagonal, $P_i$, $P$ permutations, and $D$ a diagonal matrix, $D_1P_1 + D_2P_2 = DP$ iff $P_1 = P_2$. Thus $P(z) = P$ is constant. Further, from (1.4), $D_A(z) = \sum_{i=-p_1}^{p_2} D_{Ai} z^{-i}$ and $D_B(z) = \sum_{i=-q_1}^{q_2} D_{Bi} z^{-i}$. Further $D_B(z)D_A(z) = \Lambda_0$. It is readily seen that the product of two two-sided scalar polynomials $a(z)$ and $b(z)$ is a constant iff for some $l$, $a(z) = a_l z^{l}$ and $b(z) = b_{-l} z^{-l}$. Thus, as $D_B(z)$ and $D_A(z)$ are diagonal, for some constant nonsingular diagonal $\Lambda_A$ and $\Lambda_B$, and $Z = \text{diag}\{z^{l_1}, \cdots, z^{l_M}\}$,

$$B(z) = \Lambda_B Z P W^H \quad \text{and} \quad A(z) = W P^T Z^{-1} \Lambda_A. \qquad (3.14)$$

As $W^H \mathcal{H} W$ is always diagonal, this is also sufficient for (2.11), with $\Lambda(h_0, \cdots, h_\kappa)$ nonsingular unless $H(z)$ has a zero at an $M$-th root of unity. Call the filters for the traditional DMT $\bar{F}_i(z)$ and $\bar{G}_i(z)$, i.e.

$$[\bar{F}_0(z), \cdots, \bar{F}_{M-1}(z)] = [1, z^{-1}, \cdots, z^{-(M-1)}]\, W^H, \quad [\bar{G}_0(z), \cdots, \bar{G}_{M-1}(z)]^T = [1, z, \cdots, z^{(M-1)}]\, W^T. \qquad (3.15)$$
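The sufficiency of the family (3.14) can be spot-checked numerically: for any permutation, integer exponents $l_i$ and nonsingular diagonal scalings, $B(z)\mathcal{H}A(z)$ should be diagonal and independent of $z$. The sketch below assumes the unitary DFT scaling; the function names, the test channel and the parameter values are illustrative assumptions.

```python
import numpy as np


def dft_matrix(M):
    """Unitary M-point DFT matrix (the 1/sqrt(M) scaling is an assumption)."""
    n = np.arange(M)
    return np.exp(-2j * np.pi * np.outer(n, n) / M) / np.sqrt(M)


def channel_circulant(h, M):
    """M x M circulant with first row [h0, 0, ..., 0, h_kappa, ..., h1]."""
    row = np.zeros(M, dtype=complex)
    row[0] = h[0]
    for i in range(1, len(h)):
        row[M - i] = h[i]
    shift = (np.arange(M)[None, :] - np.arange(M)[:, None]) % M
    return row[shift]


def dmt_family(z, perm, exps, lam_b, lam_a):
    """Evaluate B(z) = Lam_B Z P W^H and A(z) = W P^T Z^{-1} Lam_A at a point z."""
    M = len(perm)
    W = dft_matrix(M)
    P = np.eye(M)[list(perm)]                                   # permutation matrix
    Z = np.diag(np.power(complex(z), np.asarray(exps, dtype=float)))
    B = np.diag(lam_b) @ Z @ P @ W.conj().T
    A = W @ P.T @ np.linalg.inv(Z) @ np.diag(lam_a)
    return B, A


Hc = channel_circulant([1.0, 0.5, -0.25], 4)
params = dict(perm=[2, 0, 3, 1], exps=[1, -2, 0, 3],
              lam_b=[1.0, 2.0, 3.0, 4.0], lam_a=[0.5, 1.0, -1.0, 2.0])
products = []
for z in (0.7 + 0.4j, -0.3 + 0.9j):
    B, A = dmt_family(z, **params)
    products.append(B @ Hc @ A)
```

The delays $z^{l_i}$ cancel between $Z$ and $Z^{-1}$ because the matrix sandwiched between them is already diagonal, which is why only scalings, delays and a permutation remain free.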
Then, because of (2.10) and (3.14), we have the following main result.

Theorem 3.1 Under (1.4), (2.11) holds for all $h_i$, with $\Lambda(h_0, \cdots, h_\kappa)$ nonsingular unless $H(z)$ has a zero at an $M$-th root of unity, iff the following two conditions hold. (i) With $\bar{F}_i(z)$ as in (3.15), for some complex $a_i \neq 0$, integers $k_i$ and permutation $P$, $[F_0(z), \cdots, F_{M-1}(z)] = [a_0 z^{Mk_0} \bar{F}_0(z), \cdots, a_{M-1} z^{Mk_{M-1}} \bar{F}_{M-1}(z)]P$; and (ii) for some complex $\beta_i \neq 0$, $G_i(z) = \beta_i F_i^*(1/z)$.

Consequently, the level of spectral overlap between the subchannels is the same as in traditional DMT.

4  Conclusions 

We  have  shown  that  all  channel  resistant  DMT  systems  are  trivial  variants  of  the  DFT  based  system,  as 
long  as  cyclic  redundancy  is  employed.  This  stands  in  contrast  to  the  results  of  [6],  [3]  where  more  general 
DMT  systems  under  precoding  are  shown  to  be  channel  resistant. 
ABSTRACT

In this paper we present an efficient bitloading algorithm that applies to both subband coding and multicarrier communication. The goal is to effect an optimal distribution of positive integer bit values among various subchannels to achieve a minimum distortion error variance for subband coding and a minimum transmitted power for multicarrier communications. The complexity of existing algorithms in the literature grows with the total number of bits that must be distributed. The novelty of our algorithm lies in the fact that its complexity is independent of the total number of bits to be allocated.

1.  INTRODUCTION 

An important problem in both subband coding and multicarrier communications is bitloading. Specifically, for an $N$-subchannel system these problems reduce to the general problem of finding $b_k$ to

$$\text{Minimize:} \quad P(b_1, \cdots, b_N) = \sum_{k=1}^{N} \phi_k(b_k) \qquad (1)$$

$$\text{Subject to:} \quad \sum_{k=1}^{N} b_k = B, \quad b_k \in \{0, 1, \cdots, B\} \qquad (2)$$

where $\phi_k$ is a convex function, and $B$ is a positive integer. In subband coding

$$\phi_k(b_k) = \alpha_k 2^{-2b_k} \qquad (3)$$

where $\alpha_k$ is determined by the signal variance in the $k$-th subchannel [1], $P(b_1, \cdots, b_N)$ is the average distortion variance, and $b_k$ is the number of bits assigned to the $k$-th subchannel. Further $\alpha_k$ increases with increasing signal variance. In multicarrier systems

$$\phi_k(b_k) = \alpha_k 2^{b_k} \qquad (4)$$

where $\alpha_k$ reflects target performance, and channel and interference conditions experienced in the $k$-th subchannel [11], [12], and $P(b_1, \cdots, b_N)$ is the total transmitted power. Higher values of $\alpha_k$ reflect more adverse subchannel conditions and/or stringent performance goals; $b_k$ is the number of bits assigned to each symbol in the cognizant subchannel.

It is recognized that for general convex functions $\phi_k(\cdot)$, the above constrained minimization grows in complexity with the size of $B$. Since $B$ can be large, it is important to formulate algorithms for which the complexity bound is independent of $B$. The principal contribution of this paper is to show that if one works with the special case of (4), then such an algorithm is indeed feasible. The algorithm we provide in this paper, though formulated in the context of (4), can be trivially modified to accommodate both (3) and indeed a general function

4>k(bk)  =  akavbk  (5) 

To place this work in context we note the presence of several bit loading algorithms in the literature. These include [3], [4], [6], [8], [10]. The two most advanced and recent are [10] and [3]. The complexity of [10] grows as $O(N \log N)$ with the number of subchannels, but linearly with $B$. On the other hand [3] provides a suboptimal solution with complexity $O(N)$. Strictly speaking its complexity does not grow with $B$, as it restricts the maximum number of bits to be assigned to any subchannel to some $B^*$. Instead the complexity grows with $B^*$. The assumption of small $B^*$ is certainly problematic in subband coding, and even in communications settings when certain subchannels experience deep fades. In such a case efficiency may demand that a large number of bits be assigned to subchannels with more favorable conditions. Yet another contributor to the complexity of [3] is the dynamic range of $\alpha_i$, which again comes into play in the presence of deep fades. All other algorithms have run times that increase with $B$.

By contrast, we provide an exact solution to (1), (2), under (4), whose complexity has an upper bound that is determined only by $N$ and is in fact $O(N \log N)$. The role of $B$ is only to induce cyclic fluctuations in the precise number of computations, and neither $B$ nor the dynamic range of $\alpha_k$ affects the upper bound of the run time.

The paper is organized as follows. Section 2 recaps a result from [13] that is specialized in this paper to formulate the algorithm given in Section 3. The complexity and proof of correctness are provided in Sections 4 and 5, respectively. Section 6 gives simulations comparing the run time of this algorithm with those of [10] and [3].

2.  A  GENERAL  RESULT 

We now present a general result from [13] that solves (1), (2) for arbitrary convex $\phi_k(\cdot)$. This result is specialized to the cases of (3) and (4) in subsequent sections. Denote, for $k = 1, \cdots, N$,

$$\delta_k(x) = \phi_k(x) - \phi_k(x - 1). \qquad (6)$$

The $\phi_k$'s being convex, it follows that

$$\delta_k(1) \le \delta_k(2) \le \cdots \le \delta_k(B), \ \ \forall k. \qquad (7)$$

Let $S$ denote the set of smallest $B$ elements of

$$\Gamma = \{\delta_k(x) : k = 1, \cdots, N, \ x = 1, \cdots, B\}.$$

The following lemma from [13] gives an optimum solution to (1), (2).

Lemma 1 The optimal solution $b^* = [b_1^*, \cdots, b_N^*]^T$ to problem (1), (2) is defined as follows:

$$b_k^* = \begin{cases} 0 & : \ \delta_k(1) \notin S \\ B & : \ \delta_k(B) \in S \\ y & : \ \delta_k(y) \in S, \ \delta_k(y + 1) \notin S \end{cases}$$

In essence this lemma provides a conceptual framework for solving (1), (2). Specifically, construct $S$, and for each $k$, determine the largest integer argument $b_k$ for which $\delta_k(b_k)$ is in $S$. For general convex functions $\phi_k$ the complexity of all known solutions grows with $B$. In the rest of the paper we show that the result, when specialized to (4), leads to an algorithm whose complexity does not depend on $B$. A trivial modification of this algorithm can be formulated for (3), and indeed (5), and is omitted.
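Lemma 1 translates directly into a greedy procedure: because each $\delta_k(\cdot)$ is nondecreasing, the $B$ smallest elements of $\Gamma$ can be collected one at a time with a priority queue. The sketch below is a baseline illustration, not the algorithm proposed in this paper; its running time grows with $B$, which is exactly the dependence the paper sets out to remove. The function name `greedy_bitload` and the example values are assumptions.

```python
import heapq


def greedy_bitload(phi, N, B):
    """Pick the B smallest increments delta_k(x) = phi(k, x) - phi(k, x - 1).

    phi(k, x) must be convex in x for each channel k (0-based here).
    Returns the list of bit counts; the cost is O(N + B log N).
    """
    bits = [0] * N
    heap = [(phi(k, 1) - phi(k, 0), k) for k in range(N)]   # delta_k(1)
    heapq.heapify(heap)
    for _ in range(B):
        _, k = heapq.heappop(heap)                          # smallest remaining increment
        bits[k] += 1
        nxt = bits[k] + 1
        heapq.heappush(heap, (phi(k, nxt) - phi(k, nxt - 1), k))
    return bits


alpha = [1.0, 3.0, 0.5]                      # illustrative channel constants


def power(k, x):
    """Multicarrier cost of (4): phi_k(x) = alpha_k * 2**x."""
    return alpha[k] * 2 ** x


allocation = greedy_bitload(power, N=3, B=5)
```

For this example the five smallest increments are 0.5, 1, 1, 2, 2, taken from channels 2 and 0, so the costly channel 1 receives no bits.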

3.  PROPOSED  LOADING  ALGORITHM 

In the special case of (4), one finds that

$$\delta_k(x) = \alpha_k 2^{x-1}. \qquad (8)$$
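The increment in (8) follows in one line from (4) and (6). For comparison, the same step applied to the subband coding cost (3) is shown as well; that second line is a worked illustration and is not stated in the text.

```latex
\begin{align*}
\delta_k(x) &= \phi_k(x) - \phi_k(x-1)
             = \alpha_k 2^{x} - \alpha_k 2^{x-1}
             = \alpha_k 2^{x-1}
             && \text{under (4)},\\
\delta_k(x) &= \alpha_k 2^{-2x} - \alpha_k 2^{-2(x-1)}
             = -3\,\alpha_k 2^{-2x}
             && \text{under (3)}.
\end{align*}
```

In both cases $\delta_k(x)$ is increasing in $x$ for $\alpha_k > 0$, consistent with (7).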

The first step of the algorithm requires ordering the $\alpha_i$, and can be accomplished in $O(N \log N)$ steps. Henceforth assume without sacrificing generality that:

$$\alpha_1 \le \alpha_2 \le \cdots \le \alpha_N. \qquad (9)$$

Define the sequence:

$$l_i = \left\lceil \log_2\left(\frac{\alpha_i}{\alpha_1}\right) \right\rceil, \quad i = 1, 2, \cdots, N \qquad (10)$$

with $l_{N+1} = \infty$, where $\lceil a \rceil$ is the smallest integer greater than or equal to $a$. The significance of the integers $l_i$ is explained by Lemma 2.

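Lemma 2 With $l_i$ defined in (10),

$$\alpha_1 2^{l_i - 1} < \alpha_i \le \alpha_1 2^{l_i},$$

i.e.

$$\delta_1(l_i) < \delta_i(1) \le \delta_1(l_i + 1).$$

Proof: From (10) we have $l_i = \lceil \log_2(\alpha_i / \alpha_1) \rceil$. The definition of the ceiling function gives us the following result,

$$l_i - 1 < \log_2\left(\frac{\alpha_i}{\alpha_1}\right) \le l_i.$$

Hence the result. ■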
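A short numerical check of (10) and Lemma 2 is sketched below. The function name `level_indices` and the sample values are assumptions; the input is taken to be already sorted as in (9).

```python
import math


def level_indices(alpha_sorted):
    """l_i = ceil(log2(alpha_i / alpha_1)) for alpha sorted in ascending order."""
    a1 = alpha_sorted[0]
    return [math.ceil(math.log2(a / a1)) for a in alpha_sorted]


alpha = [1.0, 1.5, 2.0, 5.0, 100.0]
levels = level_indices(alpha)

# Lemma 2: alpha_1 * 2**(l_i - 1) < alpha_i <= alpha_1 * 2**l_i for every i
lemma2_holds = all(alpha[0] * 2.0 ** (l - 1) < a <= alpha[0] * 2.0 ** l
                   for a, l in zip(alpha, levels))
```

Each $l_i$ is thus the number of bits channel 1 must already carry before the first bit on channel $i$ becomes competitive.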

Then the proposed algorithm for solving (1), (2) under (4) is given below. It assumes that the ordering implicit in (9) has already occurred, and assigns $b_i$ bits to the $i$-th subchannel.

Proposed algorithm

Step-1: Find the smallest $k$ such that

$$R_k = \sum_{i=1}^{k-1} (l_k - l_i) \ge B. \qquad (11)$$

Then

$$b_i = 0 \quad \forall i \in \{k, k + 1, \cdots, N\}. \qquad (12)$$

Step-2: Find

$$\Delta = B - R_{k-1} \qquad (13)$$

$$r = \Delta \bmod (k - 1) \qquad (14)$$

$$q = \Delta \ \text{div} \ (k - 1) \qquad (15)$$

Step-3: Find the $r$ smallest elements of the set

$$\{\delta_1(l_{k-1} - l_1), \ \delta_2(l_{k-1} - l_2), \ \cdots, \ \delta_{k-1}(0)\}. \qquad (16)$$

In particular, with $j_i \in \{1, 2, \cdots, k - 1\}$ such that

$$\delta_{j_i}(l_{k-1} - l_{j_i}) \le \delta_{j_{i+1}}(l_{k-1} - l_{j_{i+1}}), \qquad (17)$$

call

$$J = \{j_1, j_2, \cdots, j_r\}. \qquad (18)$$

If $r = 0$, $J$ is empty.

Step-4: For all $i \in \{1, 2, \cdots, k - 1\}$,

$$b_i = \begin{cases} l_{k-1} - l_i + q + 1 & \text{if } i \in J, \\ l_{k-1} - l_i + q & \text{else.} \end{cases} \qquad (19)$$
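Steps 1 to 4 can be sketched as follows for the cost (4). This is a minimal reading of the algorithm under stated assumptions: the function names are hypothetical, $R_k \ge B$ is taken as the stopping test in (11), $k = N + 1$ stands in for $l_{N+1} = \infty$, and floating-point `log2` is assumed accurate enough for the ceiling in (10). A greedy baseline in the spirit of Lemma 1 is included only to cross-check the total power.

```python
import heapq
import math


def proposed_bitload(alpha, B):
    """Steps 1-4 for phi_k(b) = alpha_k * 2**b; returns bits in the caller's order."""
    if B < 1:
        raise ValueError("B must be a positive integer")
    N = len(alpha)
    order = sorted(range(N), key=lambda i: alpha[i])       # ordering (9)
    a = [alpha[i] for i in order]
    l = [math.ceil(math.log2(x / a[0])) for x in a]        # (10), l[0] == 0

    # Step 1: smallest k (1-based) with R_k >= B, using a running sum of the l_i
    k, R_prev, prefix = N + 1, 0, 0
    for n in range(1, N + 1):
        R_n = (n - 1) * l[n - 1] - prefix
        if R_n >= B:
            k = n
            break
        R_prev, prefix = R_n, prefix + l[n - 1]

    # Step 2: Delta = B - R_{k-1}, split into quotient and remainder
    q, r = divmod(B - R_prev, k - 1)

    # Step 3: the r smallest delta_i(l_{k-1} - l_i), proportional to alpha_i * 2**(-l_i)
    J = set(heapq.nsmallest(r, range(k - 1), key=lambda i: a[i] * 2.0 ** (-l[i])))

    # Step 4: assign bits to the active channels, zero elsewhere
    bits = [0] * N
    for i in range(k - 1):
        bits[order[i]] = l[k - 2] - l[i] + q + (1 if i in J else 0)
    return bits


def greedy_bitload(alpha, B):
    """Baseline: repeatedly take the smallest increment alpha_k * 2**(x - 1)."""
    bits = [0] * len(alpha)
    heap = [(a, k) for k, a in enumerate(alpha)]
    heapq.heapify(heap)
    for _ in range(B):
        cost, k = heapq.heappop(heap)
        bits[k] += 1
        heapq.heappush(heap, (2 * cost, k))
    return bits


def total_power(alpha, bits):
    return sum(a * 2 ** b for a, b in zip(alpha, bits))
```

For example, `proposed_bitload([1.0, 3.0], 3)` walks through $l = [0, 2]$, $k = 3$, $\Delta = 1$, $q = 0$, $r = 1$ and returns two bits for the first channel and one for the second, matching the greedy baseline. When increments tie, the two methods may return different allocations with the same total power.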

4.  COMPLEXITY 

Observe that the complexity implicit in achieving (9) is $O(N \log N)$. Determination of $k$ so that (11) holds requires at most $2N$ operations, regardless of $B$. Indeed one has, with

$$p_1 = 0,$$

$$p_n = p_{n-1} + l_n,$$

$$R_n = (n - 1) l_n - p_{n-1}.$$

The only impact that $B$ has on the complexity of determining $k$ is that for sufficiently small $B$, $k < N$ and the number of computations is further reduced to $2(k - 1)$.

The cost of determining the ranking manifest in (17) is determined only by $r$ and $k$, and is

$$O(r \log(k - 1)) \le O((N - 1) \log(N - 1)).$$

Determination of $r$ requires 2 operations, independent of $B$. $B$ does affect the precise value of $r$, which however is no greater than $N - 1$.

Thus the overall complexity is bounded by $O(N \log N)$, with $B$ playing no role in the determination of this bound. The only effect that $B$ has on the overall complexity is to cause fluctuations in the precise number of operations, within a range that is independent of $B$. To recap, these fluctuations occur when:

•  For small $B$, $k < N$, and finding $k$ requires only $2(k - 1)$ operations.

•  As $B$ changes, $r$ fluctuates between 0 and $N - 1$, and the number of operations required to determine the smallest $r$ elements of the set in (16) changes.


if  i  £  J, 
else. 


(19) 


5.  PROOF  FOR  CORRECTNESS 


Proof:  For  all  i  £  {2,  •  •  • ,  k  —  1},  from  Lemma  2,  we  have: 


We now show that the algorithm in Section 3 does indeed solve (1), (2), under (4). In view of Lemma 1 it suffices to show that the set

$$S^* = \{\delta_1(1), \cdots, \delta_1(b_1), \ \delta_2(1), \cdots, \delta_2(b_2), \ \cdots, \ \delta_{k-1}(b_{k-1})\} \qquad (20)$$

is such that $S^* = S$, defined in Section 2. This in turn requires the demonstration of the following facts.

(A) $|S^*| = |S| = B$, where $|\cdot|$ represents the cardinality of its argument.

(B) For all $i, j \in \{1, 2, \cdots, N\}$,

$$\delta_i(b_i + 1) \ge \delta_j(b_j).$$

The first theorem proves (A).

Theorem 1 With $b_i$ defined in (11-19), $|S^*| = B$.

Proof: Since $b_i = 0$ for all $i \in \{k, k + 1, \cdots, N\}$, we need to show that

$$\sum_{i=1}^{k-1} b_i = B.$$

From (11-19) we have that

$$\sum_{i=1}^{k-1} b_i = \sum_{i \notin J} b_i + \sum_{i \in J} b_i = r(q + 1) + (k - 1 - r)q + \sum_{i=1}^{k-1} (l_{k-1} - l_i) = \Delta + R_{k-1} = B.$$

To  prove  (B )  we  need  an  additional  Lemma. 


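$$\delta_i(b_i) = \alpha_i 2^{l_{k-1} - l_i + q - 1} \le \alpha_1 2^{l_{k-1} + q - 1} = \delta_1(b_1), \qquad (21)$$

as $l_1 = 0$. Thus $\delta_1(b_1)$ is the largest member of $S^*$ in (20).

From Lemma 2, for all $i \in \{1, \cdots, k - 1\}$,

$$\delta_i(b_i + 1) = \alpha_i 2^{l_{k-1} - l_i + q} > \alpha_1 2^{l_{k-1} + q - 1} = \delta_1(b_1). \qquad (22)$$

Further, as (12) holds, we have from Lemmas 2 and 3 that for all $i \in \{k, k + 1, \cdots, N\}$,

$$\delta_1(b_1) = \alpha_1 2^{l_{k-1} + q - 1} \le \alpha_1 2^{l_k - 1} < \alpha_k = \delta_k(1). \qquad (23)$$

Together, (21), (22) and (23) prove the result. ■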

Finally we prove (B) for the case where $r \neq 0$.

Theorem 3 Consider (10-19). Suppose $r \neq 0$. Then (B) above holds.

Proof: With the indices $j_i$ defined in (17), we first show that

$$\delta_{j_r}(b_{j_r}) \ge \delta_i(b_i) \quad \forall i \in \{1, \cdots, k - 1\}. \qquad (24)$$

In view of (17) this is clearly true for $i \in J$. Now consider $p \in \{\{1, \cdots, k - 1\} - J\}$. Because of (19) and Lemma 2,

$$\delta_p(b_p) = \alpha_p 2^{l_{k-1} - l_p + q - 1} \le \alpha_1 2^{l_{k-1} + q - 1} < \alpha_{j_1} 2^{l_{k-1} - l_{j_1} + q} = \delta_{j_1}(b_{j_1}) \le \delta_{j_r}(b_{j_r}),$$

where the last inequality follows from (17).

For all $i \in \{\{1, \cdots, k - 1\} - J\}$, (17, 18) demonstrate that

$$\delta_i(b_i + 1) \ge \delta_{j_r}(b_{j_r}). \qquad (25)$$

Further, from Lemma 2, for all $i \in J$,

$$\delta_i(b_i + 1) = \alpha_i 2^{l_{k-1} - l_i + q + 1} > \alpha_1 2^{l_{k-1} + q} \ge \delta_{j_r}(b_{j_r}).$$

Lemma  3  With  U,  k  and  q  as  in  (10-15), 

f  <lk-  Ik- 1  if  r  =  0 
q\  <  h  -  Ik- i  ifr  =£  0 

Proof:  From  (11-15) 

(k-l)q  +  r  <  Rk-  Rk- i 

k  k—  1 

=  ^  ^i)  ^(^-1 

i=  1  i=  1 

=  (k  —  1  )(Zfc  —  Ik- 1). 

Hence  the  result.  ■ 

We  now  prove  (B)  for  the  case  where  r  =  0. 

Theorem  2  Consider  (10-19).  Suppose  r  =  0.  Then  (B)  above 
holds. 


Then  the  result  is  proved  by  observing  from  Lemma  3  that 

5>r(bjr)  =  ajr2lk-'~ltr+q 

<  ajr.2lk~1~lir~1 

<  ai24-1 

<  ak  =  4(1). 
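Facts (A) and (B) can also be checked numerically for any candidate allocation. The following is a minimal sketch and not the algorithm of Section 3. It assumes the incremental cost has the form $\delta_i(b) = \alpha_i 2^{b-1}$, which is read off from (21)-(23), and it uses a plain bit-by-bit greedy allocator as a reference. The names `delta`, `greedy_allocate` and `satisfies_A_and_B` are hypothetical.

```python
import heapq

def delta(alpha_i, b):
    """Assumed cost of the b-th bit on a channel: alpha_i * 2**(b - 1)."""
    return alpha_i * 2.0 ** (b - 1)

def greedy_allocate(alpha, B):
    """Reference allocator: give each bit to the cheapest next increment."""
    bits = [0] * len(alpha)
    heap = [(delta(a, 1), i) for i, a in enumerate(alpha)]
    heapq.heapify(heap)
    for _ in range(B):
        _, i = heapq.heappop(heap)
        bits[i] += 1
        heapq.heappush(heap, (delta(alpha[i], bits[i] + 1), i))
    return bits

def satisfies_A_and_B(alpha, bits, B):
    """(A): the bits sum to B.
    (B), restricted to channels that received bits: every increment not yet
    taken costs at least as much as the last increment taken on any channel."""
    if sum(bits) != B:
        return False
    used = [delta(a, b) for a, b in zip(alpha, bits) if b > 0]
    unused = [delta(a, b + 1) for a, b in zip(alpha, bits)]
    return not used or min(unused) >= max(used)

alpha = [1.0, 1.5, 3.0, 10.0]      # illustrative values, sorted ascending
bits = greedy_allocate(alpha, 6)
print(bits, satisfies_A_and_B(alpha, bits, 6))
```

Because the assumed $\delta_i(b)$ is increasing in $b$, comparing only the last taken and first untaken increment on each channel is enough to test (B).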


6.  SIMULATIONS 

Figures 1 and 2 compare the algorithms of [10] and [3] with the proposed algorithm in terms of the number of computations required, for N = 32 and N = 64, respectively. In implementing [3], which is a suboptimal algorithm, the maximum number of bits, B*, that any channel can be assigned is kept at B.
Fig. 1. Runtime comparisons of the three algorithms for N = 32


Fig. 2. Runtime comparisons of the three algorithms for N = 64 (number of channels = 64; horizontal axis: number of bits to be allocated, B)


The number of computations needed for each algorithm to converge to the optimal solution was calculated by assuming that an addition, subtraction, div, mod, multiplication or division of two numbers needs one computation, as does a logical comparison between two decimal numbers. The results show that the algorithm described in [3] is linear with respect to B, while the algorithm in [10] needs a large number of computations to converge as B grows. The number of computations needed for the proposed algorithm is independent of B; the minor variations seen are attributed to the facts that for small B, k in (11) is small, reducing the number of computations slightly, and to cyclic fluctuations induced by the variation in r (see (14)) between 0 and N − 1.
the sorting algorithm whose convergence depends on the input vector) and the difference in the runtimes becomes very significant for large B. Further, the proposed algorithm outperforms that of [3] even when B is small, and even when B* = B is chosen in [3]. This is largely because the run time in [3] grows with the dynamic range of $\alpha_k$.
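The counting convention above, one unit per arithmetic operation or comparison, can be mimicked with a simple tally. The sketch below is an illustration only: it counts operations for a naive bit-by-bit greedy loader, which is not the algorithm of [3], of [10], or the proposed one, and it assumes an incremental cost that doubles with each added bit. The function name `count_naive_greedy_ops` is hypothetical.

```python
def count_naive_greedy_ops(alpha, B):
    """Return (bits, ops) for a bit-by-bit greedy loader.

    Each comparison, addition and multiplication counts as one operation."""
    n = len(alpha)
    bits = [0] * n
    cost = list(alpha)            # cost[i]: price of the next bit on channel i
    ops = 0
    for _ in range(B):
        best = 0
        for i in range(1, n):     # linear scan for the cheapest next bit
            ops += 1              # one comparison
            if cost[i] < cost[best]:
                best = i
        bits[best] += 1
        ops += 1                  # one addition
        cost[best] *= 2.0
        ops += 1                  # one multiplication (assumed doubling cost)
    return bits, ops

alpha = [1.0, 1.5, 3.0, 10.0]     # illustrative values
for B in (6, 12):
    print(B, count_naive_greedy_ops(alpha, B)[1])
```

In this sketch the tally is B(N + 1), so doubling B doubles the count. Any loader that hands out bits one at a time inherits this growth in B, which is the dependence the proposed algorithm avoids.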


We presented an optimum bit loading algorithm with a run time of $O(N \log N)$, which is more efficient than those existing in the literature in that its complexity does not depend on the total number of bits to be allocated. The improvement in performance is very significant if B is large compared to N.